calloc() vs. malloc()

The question:

Suppose I wished to initialize a dynamically allocated array of integers to zero. Would I do better to use calloc() or malloc + iterate over all entries setting each to zero? Which one is regarded as a better approach?

Answer:

If you need the dynamically allocated memory to be zero-initialized then use calloc.

If you don’t need the dynamically allocated memory to be zero-initialized, then use malloc.

You don’t always need zero-initialized memory; if you don’t need the memory zero-initialized, don’t pay the cost of initializing it. For example, if you allocate memory and then immediately copy data to fill the allocated memory, there’s no reason whatsoever to perform zero-initialization.
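To make the distinction concrete, here is a minimal sketch. The helper names copy_ints() and new_counters() are made up for illustration: the first overwrites every byte right after allocating, so malloc is enough; the second relies on the counters starting at zero, so it uses calloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src[0..count-1], or NULL.
 * memcpy() overwrites every allocated byte, so zeroing first would be wasted work. */
int *copy_ints(const int *src, size_t count)
{
    if (count == 0 || count > SIZE_MAX / sizeof *src)
        return NULL;                      /* nothing to copy, or size would overflow */

    int *dst = malloc(count * sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, count * sizeof *dst);
    return dst;
}

/* Hypothetical helper: returns count counters that must all start at zero. */
int *new_counters(size_t count)
{
    return calloc(count, sizeof(int));
}
```

The caller frees both results with free() as usual.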

calloc and malloc are functions that do different things: use whichever one is most appropriate for the task you need to accomplish.

However, note this caveat: “calloc() should work if you want to zero-initialize an array of ints. The real so-called problem with calloc() is that it will initialize all bits to zero, which is not necessarily the zero value for every data type.” (Thanks to Whiteflags for the quote.)
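A small sketch of what that caveat means in practice. The C standard guarantees that all-bits-zero is the value zero for integer types, but it makes no such promise for floating-point values or null pointers (mainstream platforms do represent both that way, which is why calloc usually appears to work). If you want to be strictly portable, you can assign the zero values yourself. The struct and the helper name new_nodes() below are invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

struct node {
    int          count;   /* all-bits-zero is guaranteed to be 0 */
    double       weight;  /* all-bits-zero is 0.0 on IEEE 754 systems, not by the standard */
    struct node *next;    /* all-bits-zero is usually, but not necessarily, NULL */
};

/* Hypothetical helper: allocate n nodes and set each member to its real
 * zero value, instead of relying on calloc()'s all-bits-zero fill. */
struct node *new_nodes(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(struct node))
        return NULL;

    struct node *nodes = malloc(n * sizeof *nodes);
    if (nodes == NULL)
        return NULL;

    const struct node zero = { 0, 0.0, NULL };
    for (size_t i = 0; i < n; i++)
        nodes[i] = zero;
    return nodes;
}
```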

Another interesting answer comes from Nominal Animal:
When a process requests more memory from the kernel via an anonymous memory map, most systems guarantee that it is already cleared to zero. A C library can exploit that: it clears only the parts it knows might be dirty, using a fast memset() variant optimized for just this purpose, and avoids “clearing” pages that are already cleared.

The larger the allocated area, the more beneficial it is (on some architectures, not necessarily on all) to use calloc() instead of malloc()+memset().
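For comparison, here are the two approaches side by side, as a sketch with made-up helper names. Both return the same zeroed buffer; the difference is only who does the zeroing. With calloc the library is free to skip pages it knows are already clean, while malloc()+memset() always touches every byte. Whether that actually shows up in timings depends on your allocator and operating system, so measure before relying on it.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: let the C library decide how much really needs clearing. */
int *zeroed_with_calloc(size_t count)
{
    return calloc(count, sizeof(int));
}

/* Hypothetical helper: allocate, then always write zeros over every byte. */
int *zeroed_with_memset(size_t count)
{
    if (count > SIZE_MAX / sizeof(int))
        return NULL;                      /* count * sizeof(int) would overflow */

    int *p = malloc(count * sizeof *p);
    if (p != NULL)
        memset(p, 0, count * sizeof *p);
    return p;
}
```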

(If you are working in the Linux world, and you or your library locks memory pages, there was a bug in the GNU C library, where it forgot to check if anonymously mapped pages were actually cleared or not. Sometimes the data was not really cleared to zeroes! This only affects the cases where the application/library locked at least some pages in memory using mlock() or mlockall(), and it’s fixed in glibc 2.8.)

If you wish to “clear” the array to a predefined entry (not just all bits zero), it is best to use malloc(), populate the initial entry of the array, and then use e.g.

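#include <stddef.h>
#include <string.h>

/* Copy the first entry (size bytes) of array into all count entries. */
void memfill(void *const array, const size_t size, const size_t count)
{
    /* Nothing to do for empty input; size == 0 would also loop forever below. */
    if (size == 0 || count == 0)
        return;

    const size_t full = size * count;   /* total bytes; assumes no overflow */
    const size_t half = full / 2;
    size_t have = size;                 /* bytes filled so far */

    /* Double the filled prefix for as long as a whole copy still fits. */
    while (have <= half) {
        memmove((char *)array + have, array, have);
        have *= 2;
    }

    /* Fill whatever tail is left from the start of the array. */
    if (have < full)
        memmove((char *)array + have, array, full - have);
}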

which copies the initial size bytes (the first entry) to all count entries in the array, no matter what the entries really contain. It works by repeatedly copying the already-filled part of the array into the rest of the array, doubling the filled part each time. It makes about log2(count) passes and, due to cache effects, is not the fastest approach there is, but it is fast and robust on all architectures.
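A usage sketch: fill_doubles() is a hypothetical wrapper that stores the value in the first entry and lets memfill() spread it over the rest. memfill() is repeated here only so the example compiles on its own, with a guard for empty input.

```c
#include <stddef.h>
#include <string.h>

/* memfill() as described above, repeated so this example is self-contained. */
void memfill(void *const array, const size_t size, const size_t count)
{
    if (size == 0 || count == 0)
        return;

    const size_t full = size * count;
    const size_t half = full / 2;
    size_t have = size;

    while (have <= half) {
        memmove((char *)array + have, array, have);
        have *= 2;
    }
    if (have < full)
        memmove((char *)array + have, array, full - have);
}

/* Hypothetical wrapper: set all count entries of values to value. */
void fill_doubles(double *values, size_t count, double value)
{
    if (count == 0)
        return;
    values[0] = value;                          /* populate the initial entry */
    memfill(values, sizeof values[0], count);   /* replicate it everywhere */
}
```

For five doubles the filled part grows from one entry to two, then four, and the final memmove() copies the one remaining entry.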

Funnily enough, on most architectures the fastest option is a function that works like memcpy() but copies overlapping regions in exactly the opposite order from memmove(); it would be almost a duplicate of memmove(), with just the pass direction reversed. However, neither the C standard nor any C library I know of provides such a function. If they did, certain benchmarks showing that “Fortran code is faster than C code” could finally be fixed in standard C. Most Fortran compilers, you see, use something very much like it to fill arrays (slices) with a constant value, and it is much faster than copying individual values.
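To show what “opposite order” means, here is a deliberately naive sketch under a made-up name, memfill_forward(). It copies forward, one byte at a time, from a source that overlaps the destination, so each byte it reads was either part of the first entry or was written a moment earlier. That is exactly the overlap behaviour memmove() is designed to avoid. The byte loop only illustrates the copy order; it makes no claim to being fast, and a real implementation would move much wider chunks.

```c
#include <stddef.h>

/* Hypothetical function: replicate the first entry (size bytes) across
 * all count entries by copying forward through overlapping regions. */
void memfill_forward(void *array, size_t size, size_t count)
{
    unsigned char *p = array;

    if (size == 0 || count == 0)
        return;

    const size_t full = size * count;   /* total bytes; assumes no overflow */

    /* Byte i is copied from byte i - size, which is already filled. */
    for (size_t i = size; i < full; i++)
        p[i] = p[i - size];
}
```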
