I'm a total C newbie; I come from C#. I've been learning about memory management and the malloc()
function. I've also come across this code:
char
Well, for a start, sizeof(char) is always 1, so you could just malloc(3).
What you're allocating there is enough space for three characters. But keep in mind you need one for a null terminator for C strings.
What you tend to find is things like:
#define NAME_SZ 30
: : :
char *name = malloc (NAME_SZ+1);
to get enough storage for a name and terminator character. Keep in mind that the string "xyzzy" is stored in memory as:
+---+---+---+---+---+----+
| x | y | z | z | y | \0 |
+---+---+---+---+---+----+
Sometimes with non-char based arrays, you'll see:
int *intArray = malloc (sizeof (int) * 22);
which will allocate enough space for 22 integers.
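As a rough sketch of that pattern, the helper below allocates room for a given number of integers, checks the result, and fills the array. The name make_int_array and the fill values are illustrative assumptions, not part of the original code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `count` ints and set element i to i.
 * Returns NULL if malloc fails; the caller must free() the result. */
int *make_int_array(size_t count)
{
    /* sizeof *arr follows the pointer's type, so this line stays
     * correct even if the element type is changed later. */
    int *arr = malloc(count * sizeof *arr);
    if (arr == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        arr[i] = (int)i;
    return arr;
}
```

Calling make_int_array(22) would give the same amount of storage as the line above, with the elements initialised.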
This will allocate three bytes; 1 for sizeof(char), plus two. Just seeing that line out of context, I have no way of knowing why it would be allocated that way or if it is correct (it looks fishy to me).
You need to allocate enough memory to hold whatever you need to put in it. For example, if you're allocating memory to hold a string, you need to allocate enough memory to hold the longest string expected plus one byte for the terminating null. If you're dealing with ASCII strings, that's easy: one byte per character plus one. If you're using Unicode strings, things get more complicated, because a character may take more than one byte depending on the encoding.
Your call to malloc will allocate 3 bytes of memory. sizeof(char) is 1 byte and 2 bytes are indicated explicitly. This gives you enough space for a string of size 2 (along with the termination character).
First point - it is a good habit to never put absolute numbers in the argument to malloc; always use sizeof and a multiple. As said above, the memory allocated for some types varies with compiler and platform. In order to guarantee getting enough space for an array of type 'blob' it is best to use something like this:
blob *p_data = malloc(sizeof(blob) * length_of_array);
This way, whatever the type is, however it looks in memory you'll get exactly the right amount.
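To make that concrete, here is a minimal sketch with a made-up blob type. The struct members, the name alloc_blobs, and the choice to reject a zero length are all assumptions for illustration; the overflow guard is an extra precaution not mentioned above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type. Padding may make sizeof(blob) larger than
 * the sum of its members, which is why the size is never hard-coded. */
typedef struct {
    char tag;
    double value;
} blob;

/* Allocate an array of `length` blobs, or return NULL on failure.
 * The caller must free() the result. */
blob *alloc_blobs(size_t length)
{
    /* Guard against the multiplication wrapping around. */
    if (length == 0 || length > SIZE_MAX / sizeof(blob))
        return NULL;
    return malloc(sizeof(blob) * length);
}
```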
Secondly, segfaults. C, as a low-level language, has no bounds checking: nothing checks whether an index actually lies inside the array. In fact, it doesn't stop you accessing memory anywhere, even memory that doesn't belong to your program (although your operating system might; that's what a segfault is). This is why, whenever you pass an array around in C, you need to pass its length as well, so that the function receiving the array knows how big it is. Don't forget that an array passed to a function is really just a pointer to its first element.
Doing that for strings would be very unhelpful, since every string argument would become two arguments, so a cheat is used. Any standard C string is null-terminated: the last character in the string is the character with value 0 ('\0'). String functions work along the array until they see it and then stop. This way they don't overrun the array, but if it's not there for some reason, they will. That being understood, strlen("Hello") is 5, but to store it you need one more character. E.g.:
const char *str1 = "Hello";
char *str2 = malloc(sizeof(char) * (strlen(str1) + 1));
strcpy(str2, str1);
And yes, sizeof(char) is unnecessary because it is defined to be 1, but I find it clearer and it is definitely a good habit.
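The same idea can be wrapped up as a small sketch. Both function names below (count_chars and copy_string) are hypothetical; count_chars only exists to show how a string function walks along the array until it sees the terminator, and in real code you would just call strlen.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strlen: count characters up to, but not
 * including, the terminating '\0'. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical helper: return a heap-allocated copy of `src`, or NULL
 * if the allocation fails. The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t len = count_chars(src);
    char *dst = malloc(len + 1);      /* +1 for the terminator */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);        /* copies the '\0' as well */
    return dst;
}
```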
That snippet is allocating enough space for a 2-character name.
Generally the string buffer is going to be filled from somewhere, i.e. I/O. If the size of the string isn't known ahead of time (e.g. reading from file or keyboard), one of three approaches is generally used:
Define a maximum size for any given string, allocate that size + 1 (for the null terminator), read at most that many characters, and error or blindly truncate if too many characters were supplied. Not terribly user friendly.
Reallocate in stages (preferably using geometric series, e.g. doubling, to avoid quadratic behaviour), and keep on reading until the end has been reached. Not terribly easy to code.
Allocate a fixed size and hope it won't be exceeded, and crash (or be owned) horribly when this assumption fails. Easy to code, easy to break. For example, see gets in the standard C library. (Never use this function.)
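The second approach (reallocating in stages by doubling) might be sketched as follows. The function name read_line, the starting capacity of 16, and the decision to drop the newline are assumptions made for this example, not a standard API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from `in` into a growing heap buffer.
 * The newline is not stored. Returns NULL on allocation failure or when
 * nothing could be read before end of file. The caller must free() it. */
char *read_line(FILE *in)
{
    size_t cap = 16;                  /* current buffer size in bytes */
    size_t len = 0;                   /* characters stored so far */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 == cap) {         /* keep one byte free for '\0' */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {       /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Doubling the capacity means a line of n characters causes only about log2(n) reallocations, which is what avoids the quadratic behaviour mentioned above.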
malloc() will allocate a block of memory and return a pointer to that memory if successful, and NULL if unsuccessful. The size of the block of memory is specified by malloc's argument, in bytes.
The sizeof operator gives the size of its argument in bytes.
char *someString = malloc(sizeof(char) * 50);
This will allocate enough space for a 49-character string (a C-style string must be terminated by a null ('\0') character, which is not counted in the 49) and point someString at that memory.
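A small sketch of both points, the NULL check and the "one less than the size" rule, is below. The helper name make_full_string and its fill parameter are invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `size` bytes and store the longest string
 * that fits, i.e. size - 1 copies of `fill` followed by '\0'.
 * Returns NULL if size is 0 or malloc fails. The caller must free() it. */
char *make_full_string(size_t size, char fill)
{
    if (size == 0)
        return NULL;

    char *s = malloc(sizeof(char) * size);
    if (s == NULL)                    /* malloc signals failure with NULL */
        return NULL;

    memset(s, fill, size - 1);        /* the usable characters */
    s[size - 1] = '\0';               /* the last byte is the terminator */
    return s;
}
```

With a size of 50, strlen on the result is 49, matching the description above.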
It looks like the code in your question should be malloc(sizeof(char) * 2), as sizeof(char) + 2 doesn't make sense.
Note that sizeof(char) is guaranteed to always equal 1 (byte), but the memory representation of other types (such as long) may vary between compilers.
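If you want to see this on your own machine, a quick sketch like the one below prints a few sizes. The function name print_type_sizes is made up; only the first value is fixed by the C standard, and the others depend on your compiler and platform, so no expected output is shown.

```c
#include <stdio.h>

/* Hypothetical helper: print the size in bytes of a few built-in types.
 * sizeof(char) is 1 by definition; the rest are implementation-defined. */
void print_type_sizes(void)
{
    printf("sizeof(char)   = %zu\n", sizeof(char));
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(long)   = %zu\n", sizeof(long));
    printf("sizeof(double) = %zu\n", sizeof(double));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
}
```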
The way that you get (un)lucky with dynamically allocated memory is if you try to read/write outside of memory you have allocated.
For example,
char *someString = malloc(10);
strcpy(someString, "Hello there, world!");
printf("%s\n", someString);
The first line allocates enough room for 9 characters, and a NULL character.
The second line attempts to copy 20 characters (19 + NULL) into that memory space. This overruns the buffer and might cause something incredibly witty, such as overwriting adjacent memory, or causing a segfault.
The third line might work, for example if there was allocated memory right beside someString, and "Hello there, world!" ran into that memory space, it might print your string plus whatever was in the next memory space. If that second space was NULL terminated, it would then stop--unless it wasn't, in which case it would wander off and eventually segfault.
This example is a pretty simple operation, yet it's so easy to go wrong. C is tricky -- be careful.
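One way to avoid the overrun in that example is to tell the copy how big the destination is. The sketch below uses snprintf for that; the wrapper name copy_bounded and its return convention are assumptions chosen for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy `src` into `dst` without ever writing more
 * than `dst_size` bytes, always leaving dst null-terminated.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;

    /* snprintf returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With the 10-byte buffer from the example, copy_bounded(someString, 10, "Hello there, world!") stores only "Hello the" plus the terminator and returns 0, so the caller can detect the truncation instead of corrupting memory.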