I just started learning C and am confused about the difference between a string pointer and a string (array of char). Can anyone help me clear my head a bit?
// source code
ch
When you say
char *name3 = "Apple";
you are declaring name3 to point to the static string "Apple". If you're familiar with higher-level languages, you can think of this as immutable (I'm going to explain it in this context because it sounds like you've programmed before; for the technical rationale, check the C standard).
When you say
char name4[10];
name4 = "Apple";
you get an error because you first declare an array of 10 chars (in other words, you reserve a 10-byte section of mutable memory), and then attempt to assign the immutable value "Apple" to this array. The literal "Apple" itself is stored elsewhere, typically in a read-only segment of memory, and in this expression it evaluates to a pointer to its first character.
This means that the types do not match:
error: incompatible types when assigning to type 'char[10]' from type 'char *'
If you want name4 to have the value "Apple", use strcpy:
strcpy(name4, "Apple");
If you want name4 to have the initial value "Apple", you can do that as well:
char name4[10] = "Apple"; // shorthand for {'A', 'p', 'p', 'l', 'e', '\0'}
The reason that this works, whereas your previous example does not, is that "Apple" is a valid array initialiser for a char[]. In other words, you are creating a 10-byte char array and setting its initial value to "Apple" (with 0s in the remaining places).
This might make more sense if you think of an int array:
int array[3] = {1, 2, 3}; // initialise array
Probably the easiest colloquial explanation I can think of is that an array is a collection of buckets for things, whereas the static string "Apple" is a single thing 'over there'.
strcpy(name4, "Apple") works because it copies each of the things (characters) in "Apple" into name4 one by one.
However, it doesn't make sense to say, 'this collection of buckets is equal to that thing over there'. It only makes sense to 'fill' the buckets with values.
I think this will help clear it up also:
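#include <stdio.h>

int main(void) {
    char *ptr = "Hello";
    char arr[] = "Goodbye";

    // These both work, as expected:
    printf("%s\n", ptr);
    printf("%s\n", arr);

    // Also works in practice (same address), though compilers may warn about the type:
    printf("%s\n", &arr);

    // %p expects a void *, hence the casts:
    printf("ptr = %p\n", (void *)ptr);
    printf("&ptr = %p\n", (void *)&ptr);
    printf("arr = %p\n", (void *)arr);
    printf("&arr = %p\n", (void *)&arr);

    return 0;
}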
Output:
Hello
Goodbye
Goodbye
ptr = 0021578C
&ptr = 0042FE2C
arr = 0042FE1C \__ Same!
&arr = 0042FE1C /
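So we see that arr and &arr evaluate to the same address (their types differ: arr decays to a char * pointing at the first element, while &arr is a char (*)[8] pointing at the whole array). Since arr is an array and not a pointer variable, there is no separate object holding its address; both expressions refer to the start of the array's storage.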
arr is an array of 7+1 bytes that lives on the stack of main(). The compiler generates instructions to reserve those bytes and then populate them with "Goodbye". There is no pointer!
ptr, on the other hand, is a pointer (4 bytes on this 32-bit build), also on the stack. That's why &ptr is very close to &arr. But what it points to is a static string ("Hello") that is off in the read-only section of the executable (which is why ptr's value is a very different number).
You cannot directly reassign a value to an array type (e.g. your array of ten chars, name4). To the compiler, name4 is an "array", and you cannot use the assignment operator = to write to an array with a string literal.
In order to actually move the content of the string "Apple" into the ten-byte array you allocated for it (name4), you must use strcpy() or something of that sort.
What you are doing with name3 is pretty different. It is created as a char * and, when declared with an initializer as in char *name3 = "Apple";, it holds the location of the static string "Apple" from the start. (Had it been declared without an initializer inside a function, its value would be indeterminate until you assigned to it.) The string typically lives in read-only memory, and attempting to write to the memory that the name3 pointer points to is undefined behavior: it may crash, and it must never be relied on.
Based on this, you can surmise that the last statement attempts to assign the memory location of a static string to something somewhere else that represents a collection of 10 chars. The language does not provide you with a pre-determined way to perform this task.
Herein lies its power.
Although pointers and arrays seem similar, they are different. char *name3 is just a pointer to char; it takes no more memory than a pointer. It just stores an address, so when you assign a string literal to it, the address stored in name3 changes to the address of "Apple".
But your name4 is an array of type char[10]; it holds the memory for 10 chars. If you want to give it a new value, you need to use strcpy or something similar to write into its memory; you cannot assign it an address like that of "Apple".
You can initialize an array when you declare it, like this:
int n[5] = { 0, 1, 2, 3, 4 };
char c[5] = { 'd', 'a', 't', 'a', '\0' };
But since we typically treat char arrays as strings, C allows a special case:
char c[5] = "data"; // Terminating null character is added.
However, once you've declared an array, you can't reassign it. Why? Consider an assignment like
char *my_str = "foo"; // Declare and initialize a char pointer.
my_str = "bar"; // Change its value.
The first line declares a char pointer and "aims" it at the first letter in foo. Since foo is a string constant, it resides somewhere in memory with all the other constants. When you reassign the pointer, you're assigning a new value to it: the address of bar. But the original string, foo, remains unchanged. You've moved the pointer, but haven't altered the data.
When you declare an array, however, you aren't declaring a pointer at all. You're reserving a certain amount of memory and giving it a name. So the line
char c[5] = "data";
starts with the string constant data, then allocates 5 new bytes, calls them c, and copies the string into them. You can access the elements of the array exactly as if you'd declared a pointer to them; arrays and pointers are (for most purposes) interchangeable in that way.
But since arrays are not pointers, you cannot reassign them.
You can't make c "point" anywhere else, because it's not a pointer; it's the name of an area of memory.
You can change the value of the string, either one character at a time, or by calling a function like strcpy():
c[3] = 'e'; // Now c = "date", or 'd', 'a', 't', 'e', '\0'
strcpy(c, "hi"); // Now c = 'h', 'i', '\0', 'e', '\0'
strcpy(c, "too long!"); // Undefined behavior: overflows into memory we don't own.
Efficiency Tip
Note also that initializing an array generally makes a copy of the data: the original string is copied from the constant area into the array's own storage (the stack for a local array, the data area for a global or static one), where your program can change it. This means you may be using up to twice as much memory as you expected. You can avoid the copy and generally save memory by declaring a pointer instead. That leaves the string in the constant area and allocates only enough memory for a pointer, regardless of the length of the string.