We can assign a string in C as follows:
char *string;
string = "Hello";
printf("%s\n", string); // string
printf("%p\n", string); // memory-address
What makes a string fundamentally different than other primitive types?
A string seems like a primitive type in C because the compiler understands "foo" and generates a null-terminated character array: ['f', 'o', 'o', '\0']. But a C string is still just that: an array of characters.
My question then is, why can't we assign a pointer to a number in C just like we do with strings?
You certainly can assign a pointer to a number. It's just that a number isn't a pointer, whereas an array used as a value converts to the address of its first element. If you had an array of int, it would work just like a string. Compare your code:
char *string;
string = "Hello";
printf("%s\n", string); // string
printf("%p\n", string); // memory-address
to the analogous code for an array of integers:
int numbers[] = {1, 2, 3, 4, 5, 0};
int *nump = numbers;
printf("%d\n", nump[0]); // first element
printf("%p\n", (void *)nump); // memory-address
The only real difference is that the compiler has some extra syntax for arrays of characters because they're so common, and printf() similarly has a format specifier just for character arrays for the same reason.
The type of a string literal (e.g. "hello world") is char[]. Assigning char *string = "Hello" means that string now points to the start of the array (i.e. it holds the address of the array's first element, &array[0]).
By contrast, you can't directly assign an integer to a pointer because their types differ: one is an int, the other is a pointer, int *. You could cast it to the correct type:
int *num;
num = (int *) 4404;
But this would be considered quite dangerous (unless you really know what you are doing). That is, do you know what is at memory address 4404?
why can't we assign a pointer to a number in C just like we do with strings?
int *num;
num = 4404;
Code can do that if 4404 is a valid address for an int.
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
C11dr §6.3.2.3 5
If the address is not properly aligned --> undefined behavior (UB).
If the address is a trap --> undefined behavior (UB).
Attempting to dereference the pointer is a problem unless it points to a valid int.
printf("%d\n", *num);
In the code below, "Hello" is a string literal. It exists someplace in memory. The assignment takes the address of the string literal and assigns it to string.
char *string;
string = "Hello";
The point is that the address assigned is known to be valid for a char *.
With num = 4404;, the address is not known to be valid (it likely is not).
What makes a string fundamentally different than other primitive types?
In C, a string is a C library specification, not a C language one. It is a definition convenient for explaining the various functions therein.
A string is a contiguous sequence of characters terminated by and including the first null character §7.1.1 1
Primitive types are part of the C language.
The language also has string literals like "Hello" in char *string; string = "Hello";. These have some similarity to strings, yet differ.
I recommend searching for "ISO/IEC9899:2017" to find a draft copy of the current C spec. It will answer many of your 10 questions from the last week.
There is no such type as "string" in C. A string is not a primitive type. A string is just an array of characters, terminated by a NUL byte ('\0').
When you do this:
char *string;
string = "Hello";
What really happens is that the compiler creates a read-only char array and then assigns its address to your variable string. This can be done because in C the name of an array, used in most expressions, converts to a pointer to its first element.
// This is placed in a different section:
const char hidden_arr[] = {'H', 'e', 'l', 'l', 'o', '\0'};
char *string;
string = hidden_arr;
// Same as:
string = &(hidden_arr[0]);
Here, hidden_arr is an array and string is a char *, but the assignment works because, as we just said, the name of an array converts to a pointer to its first element. Of course, all of this is done transparently; you will not actually see another variable named hidden_arr, that's just an example. In reality the string will be stored in some location in your executable without a name, and the address of that location will be copied to your string pointer.
When you try to do the same with an integer, it's wrong because int * and int are different types, and you cannot write this (well, you can, but it's meaningless and does not do what you expect it to):
int *ptr;
ptr = 123;
But, you can very well do it with an array of integers:
int arr[] = {1, 2, 3};
int *ptr;
ptr = arr;
// Same as:
ptr = &(arr[0]);