Is there some overhead of using variable-length arrays? Could the size of array be passed via command line argument at run time? Why is it introduced, compared to automatic
There is some run-time overhead with variable-length arrays, but you would have to be working fairly hard to measure it. Note that sizeof(vla)
is not a compile-time constant if vla
is a variable-length array.
The size of the array can be passed to a function at run-time. If you choose to take the size from a command line argument and convert that into an integer and pass that to the function at run-time, so be it -- it will work.
Variable-length arrays are used because the variables are automatically allocated to the correct size and automatically freed on exit from the function. This avoids over-allocating space (allocating enough space for the maximum possible size when you mostly work with minimal sizes), and avoids problems with memory clean up.
Additionally, with multi-dimensional arrays, AFAIK it behaves more like Fortran - you can dynamically configure all the dimensions, rather than being stuck with fixed sizes for all but the leading dimension of the array.
Concrete evidence of some run-time overhead for VLA - at least with GCC 4.4.2 on SPARC (Solaris 10).
Consider the two files below:
#include <assert.h>
#include <stddef.h>
extern size_t identity_matrix(int n, int m);
size_t identity_matrix(int n, int m)
{
int vla[n][m];
int i, j;
assert(n > 0 && n <= 32);
assert(m > 0 && m <= 32);
for (i = 0; i < n; i++)
{
for (j = 0; j < m; j++)
{
vla[i][j] = 0;
}
vla[i][i] = 1;
}
return(sizeof(vla));
}
#include <assert.h>
#include <stddef.h>
extern size_t identity_matrix(int n, int m);
size_t identity_matrix(int n, int m)
{
int fla[32][32];
int i, j;
assert(n > 0 && n <= 32);
assert(m > 0 && m <= 32);
for (i = 0; i < n; i++)
{
for (j = 0; j < m; j++)
{
fla[i][j] = 0;
}
fla[i][i] = 1;
}
return(sizeof(fla));
}
For comparison purposes, the names of the local array are different (vla
vs fla
), and the dimensions on the array are different when it is declared - otherwise, the files are the same.
I compiled using:
$ gcc -O2 -c -std=c99 fla.c vla.c
The object file sizes are somewhat different - as measured both by 'ls' and by 'size':
$ ls -l fla.o vla.o
-rw-r--r-- 1 jleffler rd 1036 Jan 9 12:13 fla.o
-rw-r--r-- 1 jleffler rd 1176 Jan 9 12:13 vla.o
$ size fla.o vla.o
fla.o: 530 + 0 + 0 = 530
vla.o: 670 + 0 + 0 = 670
I've not done extensive testing to see how much of the overhead is fixed and how much is variable, but there is overhead in using a VLA.
There should be very little overhead for VLAs (At most it should result in an addition to the stack pointer). Dynamic allocation requires manual memory management and is slower than stack-based allocation of a VLA, and "automatic" declaration of an array requires a compile-time expression for the array size. However, keep in mind that if a stack overflow occurs, it will cause undefined behavior, so keep VLAs relatively small.
You could pass the size of an array via a command-line argument, but you would have to write the code to handle that yourself.
I just wonder if there is some overhead of using variable-length arrays?
Nope
Can the size of array could be passed via command line argument at run time?
Yes.
Why is it introduced, compared to automatic and dynamically allocating an array?
Automatic allocated only allows a fixed size known at compile time.
Dynamically allocating (malloc
) will store the array on the heap, which has a large memory space, but is slower to access.
VLA works by placing the array in the stack. This makes allocation and access extremely fast, but the stack is usually small (of a few KB), and when the VLA overflowed the stack, it's indistinguishable from an infinite recursion.
VLA does have some overhead (compared to "ordinary" named compile-time-sized array).
Firstly, it has run-time length and yet the language provides you with means to obtain the actual size of the array at run-time (using sizeof
). This immediately means that the actual size of the array has to be stored somewhere. This results in some insignificant per-array memory overhead. However, since VLAs can only be declared as automatic objects, this memory overhead is not something anyone would ever notice. It is just like declaring an extra local variable of integral type.
Secondly, VLA is normally allocated on stack, but because of its variable size, in general case its exact location in memory is not known at compile time. For this reason the underlying implementation usually has to implement it as a pointer to a memory block. This introduces some additional memory overhead (for the pointer), which is again completely insignificant for the reasons described above. This also introduces slight performance overhead, since we have to read the pointer value in order to find the actual array. This is the same overhead you get when accessing malloc
-ed arrays (and don't get with the named compile-time-sized arrays).
Since the size of the VLA is a run-time integer value, it can, of course, be passed as a command-line argument. VLA doesn't care where its size comes from.
VLA were introduced as run-time-sized arrays with low allocation/deallocation cost. They fit between "ordinary" named compile-time-sized arrays (which have virtually zero allocation-deallocation cost, but fixed size) and malloc
-ed arrays (which have run-time size, but relatively high allocation-deallocation cost).
VLA obey [almost] the same scope-dependent lifetime rules as automatic (i.e local) objects, which means that in general case they can't replace malloc
-ed arrays. They applicability is limited to situations when you need a quick run-time sized array with a typical automatic lifetime.