I have initialized a 2D array like a 1D array:
int a[2][3] = {1,2,3,4,5}
How are the values stored in this array?
They are assigned as follows:
1 2 3
4 5 0
The zero is because you've allocated an array of size 6, but only specified 5 elements.
This is called "row-major order".
You may wish to formalize your code slightly. Your code is currently:
int a[2][3] = {1,2,3,4,5};
If you compile this with gcc main.c -Wall -pedantic --std=c99
, you'll get a few warnings:
temp.c:2:17: warning: missing braces around initializer [-Wmissing-braces]
Resolve this using
int a[2][3] = {{1,2,3,4,5}};
This will give you a new warning:
temp.c:2:25: warning: excess elements in array initializer
Resolve this using:
int a[2][3] = {{1,2,3},{4,5,0}};
This explicitly represents the data as having two rows of three elements each.
Some thoughts on memory layout
int a[2][3]
will produce an "array of arrays". This is similar to, but contradistinct from, an "array of pointers to arrays". Both have similar access syntax (e.g. a[1][2]
). But only for the "array of arrays" can you reliably access elements using a+y*WIDTH+x
.
Some code might clarify:
#include <stdlib.h>
#include <stdio.h>
void PrintArray1D(int* a){
for(int i=0;i<6;i++)
printf("%d ",a[i]);
printf("\n");
}
int main(){
//Construct a two dimensional array
int a[2][3] = {{1,2,3},{4,5,6}};
//Construct an array of arrays
int* b[2];
b[0] = calloc(3,sizeof(int));
b[1] = calloc(3,sizeof(int));
//Initialize the array of arrays
for(int y=0;y<2;y++)
for(int x=0;x<3;x++)
b[y][x] = a[y][x];
PrintArray1D(a[0]);
PrintArray1D(b[0]);
}
When you run this, you get:
1 2 3 4 5 6
1 2 3 0 0 0
Printing b
gives zeros (on my machine) because it runs into uninitialized memory. The upshot is that using contiguous memory allows you to do handy things, let set all of the values without needing a double loop.
In C, 1D arrays are stored in a single linear buffer in memory in what is called "row major" order. Row major means that the last index varies fastest as you go from element to element. Column major would mean that the first index varies fastest, as it does in MATLAB, for example.
The array you declared is only 2D in the sense that the compiler helps you out by computing the linear address of the elements for you. The address of an element in a 1D array is computed linear[x] = linear + x
. Similarly, for your 2D array, a[y][x] = a + 3 * y + x
. In general, a[y][x] = a + num_cols * y + x
.
You can initialize the array as a single vector of elements, which will first fill the first row, then the second, and so on. Since you have two rows of three elements each, the first row becomes 1, 2, 3
and the second row becomes 4, 5, 0
.
Indexing past the end of a row is perfectly valid, as far as the compiler is concerned at least. In the example you give, a[0][3]
is accessing the fourth element of the first row in an array that is three elements wide. With wrap-around, you can see that this is just the first element of the second row, which is more explicitly stated as a[1][0]
.
Because of the lax index checking, you can completely omit the first index in any array as long as you provide an initializer. The formula to compute the linear address does not depend on the first index (because it is row major), and the total number of elements is specified by the initializer itself. A 1D example is int linear[] = {1, 2, 3};
.
Keep in mind that the name of the array also refers to the pointer to its first element. These are two different things that can be accessed by the same name.
From the definition of how an access to a 2D-array like a[1][2]
is interpreted, "It follows from this that arrays are stored in row-major order" (cf, for example, this online C standard comitee draft / array subscripting).
This means that for an array int a[ROWS][COLUMNS]
for an access a[r][c]
the offset in terms of int values is calculated like (r*COLUMNS + c)
.
So for an array int a[2][3]
, an access a[0][1]
has offset 0*3 + 1 = 1
, and an access a[1][0]
has an offset 1*3 + 0 = 3
. That said, a[0][3]
might lead to offset 3
, while a[1][0]
for sure leads to 3
. I wrote "might" because I think that accessing an array int a[2][3]
with a[0][3]
is undefined behaviour, as the range of the last subscript is 0..2
. So according to 6.5.6 (8), expression a[0][3]
is addressing the sub-array a[0]
out of its bounds as argued, for example, here.
Now to the thing of how int a[2][3] = {1,2,3,4,5}
is interpreted. This statement is initialization as defined in section 6.7.9 of this online C standard comitee draft, and paragraphs (20) to (26) describe the things needed here:
(20) If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.
(21) If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
26 EXAMPLE
(3) The declaration
int y[4][3] = { { 1, 3, 5 }, { 2, 4, 6 }, { 3, 5, 7 }, };
is a definition with a fully bracketed initialization:
1
,3
, and5
initialize the first row ofy
(the array objecty[0]
), namelyy[0][0]
,y[0][1]
, andy[0][2]
. Likewise the next two lines initializey[1]
andy[2]
. The initializer ends early, soy[3]
is initialized with zeros. Precisely the same effect could have been achieved byint y[4][3] = { 1, 3, 5, 2, 4, 6, 3, 5, 7 };
The initializer for
y[0]
does not begin with a left brace, so three items from the list are used. Likewise the next three are taken successively fory[1]
andy[2]
.