I have a big problem with the C language when it comes to strings, char *'s or whatever... So in this particular case I have a huge problem. I want to create an array
How can I announce the array and then define its size?
Don't; 'announce' it when you know what size it needs to be. You can't use it before then anyway.
In C99 and later, you can define variables when needed, anywhere in a statement block. You can also use VLAs (variable-length arrays), where the size is not known until runtime. Don't create enormous arrays as VLAs (e.g. 1 MiB or more, but tune the limit to suit your machine and prejudices); use dynamic memory allocation for those instead.
If you're stuck with the archaic C89/C90 standard, then you can only define variables at the start of a block, and arrays have sizes known at compile time, so you have to use dynamic memory allocation: malloc(), free(), etc.
First declare a pointer to a "char". Then ask (the system) for space to store your required number of values using malloc and then add elements to this "array".
char * test;
int num_of_elements = 99;
test = malloc(sizeof(char) * num_of_elements);
//test points to first array element
test[0] = 11;
test[1] = 22;
//etc
Depending upon (i) your tool-chain and (ii) how and when you will know the size, you have the option to use either (a) Variable Length Arrays or (b) Dynamic Memory Allocation functions.
if (tool-chain supports C99 and later) and (you will know the array length at runtime) use Variable Length Array
if (older tool-chain) or (you want the flexibility of allocating and releasing the memory) use Dynamic Memory allocation function
Here are samples:
void f(int m, char C[m][m])
{
    char test[m];
    /* ... */
}
or
void somefunc(int n)
{
    char *test;
    test = malloc (n * sizeof (char));
    // check for test != NULL
    // use test
    free (test);
}
can be written using VLA as
int n = 5;
char test[n];
There are two ways of solving this issue.
Method #1: Use a maximum size to define your array. Here's what the code looks like:
char test[max_size];
You then need to keep track of how many elements are actually used up. This is used commonly in some old-school networking code.
Method #2: Use dynamic memory. Note that there is a bit of a performance issue here (potentially) since you have to ask the OS each time for a chunk of memory. There is an answer up here already that shows you how to do this. Just be sure to call free() once you are done using your array.
Declaring Static Character Arrays (strings)
When you know (or have a reasonable idea) how large your array needs to be, you can simply declare an array of sufficient size to handle your input (i.e. if your names are no longer than 25 characters, then you could safely declare name[26]). For strings, you always need at minimum the number of chars to store + 1 (for the null-terminating character).
If there may be a few characters more than 25, there's nothing wrong with declaring your array a few bytes longer than needed to protect against accidental writing beyond the end of the array. Say name[32].
Let's declare an array of 5-characters below and look at how the information is stored in memory.
char name[5] = {0}; /* always initialize your arrays */
The above declaration creates an array of 5-contiguous bytes on the stack for your use. e.g., you can visualize the 5-bytes of memory initialized to zero as follows:
       +---+---+---+---+---+
name   | 0 | 0 | 0 | 0 | 0 |
       +---+---+---+---+---+
        \
         'name' holds the starting address for
         the block of memory that is the array
         (i.e. the address of the first element
         like: 0x7fff050bf3d0)
Note: when using name to hold a 'character string', the actual string can be no longer than 4 chars because you must end a string with a null-terminating character, which is the null-character '\0' (or simply numeric 0, both are equivalent).
To store information in name, you can either do it by assigning characters one-at-a-time:
name[0] = 'J';
name[1] = 'o';
name[2] = 'h';
name[3] = 'n';
name[4] = 0; /* null-terminate. note: this was already done by initialization */
In memory you now have:
       +---+---+---+---+---+
name   | J | o | h | n | 0 |    /* you actually have the ASCII value for */
       +---+---+---+---+---+    /* each letter stored in the elements    */
Of course, nobody assigns one character at a time in this manner. Your options are many: using one of the functions provided by C, e.g. strcpy, strncpy, memcpy, or by reading information from a file stream or file descriptor with fgets, getline, or by using a simple loop and index variable to do the assignment, or by using one of the string formatting functions, e.g. sprintf, etc... For example you can accomplish the same thing with any of the following:
/* direct copy */
strcpy (name, "John");
strncpy (name, "John", 5);
memcpy (name, "John", sizeof "John"); /* include copy of the terminating char */
/* read from stdin into name */
printf ("Please enter a name (4 char max): ");
scanf ("%4[^\n]%*c", name); /* width 4 leaves room for the terminator */
Note: above with strncpy, if you had NOT initialized all elements to 0 (the last of which will serve as your null-terminating character), and then used strncpy (name, "John", 4); you would need to manually terminate the string with name[4] = 0;, otherwise you would not have a valid string (you would have an unterminated array of chars, which would lead to undefined behavior if you used name where a string was expected).
If you do not explicitly understand this, STOP: go read and understand what a null-terminated string is and how it differs from an array of characters. Seriously, stop now and go learn, it is that fundamental to C. (If it doesn't end with a null-terminating character, it isn't a C-string.)
What if I don't know how many characters I need to store?
Dynamic Allocations of Character Strings
When you do not know how many characters you need to store (or generally how many of whatever data type), the normal approach is to declare a pointer to type, and then allocate a reasonably anticipated amount of memory (just based on your best understanding of what you are dealing with), and then reallocate to add additional memory as required. There is no magic to it, it is just a different way of telling the compiler how to manage the memory. Just remember, when you allocate the memory, you own it. You are responsible for (1) preserving a pointer to the beginning address of the memory block (so it can be freed later); and (2) freeing the memory when you are done with it.
A simple example will help. Most of the memory allocation/free functions are declared in stdlib.h.
char *name = NULL;   /* declare a pointer, and initialize to NULL */
name = malloc (5 * sizeof *name);   /* allocate a 5-byte block of memory for name */
if (!name) {   /* validate memory was allocated -- every time */
    fputs ("error: name allocation failed, exiting.", stderr);
    exit (EXIT_FAILURE);
}
/* Now use name, just as you would the statically declared name above */
strncpy (name, "John", 5);
printf (" name contains: %s\n", name);
free (name);   /* free memory when no longer needed.
                  (if reusing name, set 'name = NULL;')
                */
Note: malloc does NOT initialize the contents of the memory it allocates. If you want to initialize your new block of memory with zero (as we did with the static array), then use calloc instead of malloc. You can also use malloc and then call memset as well.
What happens if I allocate memory, then need More?
As mentioned above discussing dynamic memory, the general scheme is to allocate a reasonable anticipated amount, then realloc as required. You use realloc to reallocate the original block of memory created by malloc. realloc essentially creates a new block of memory, copies the memory from your old block to the new, and then frees the old block of memory (it may also be able to resize the block in place). If reallocation fails, realloc returns NULL and leaves your original block untouched, so you want to assign its result to a temporary pointer. That way you still have your original block of memory available to you, rather than overwriting your only pointer to it with NULL.
You are free to add as little or as much memory as you like at any call to realloc. The standard scheme usually seen is to start with some initial allocation, then reallocate twice that amount each time you run out. (This means you need to keep track of how much memory is currently allocated.)
To sew this up, let's end with a simple example that simply reads a string of any length as the first argument to the program (use "quotes" if your string contains whitespace). It will then allocate space to hold the string, then reallocate to append more text to the end of the original string. Finally it will free all memory in use before exit:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char **argv) {
    if (argc < 2) {   /* validate input */
        fprintf (stderr, "error: insufficient input. usage: %s \"name\"\n",
                 argv[0]);
        return 1;
    }
    size_t len = strlen (argv[1]);   /* length of input */
    size_t sz_mem = len + 1;         /* memory required */
    char *name = malloc (sz_mem * sizeof *name);   /* allocate memory for name */
    if (!name) {   /* validate memory created successfully or throw error */
        fputs ("error: name allocation failed, exiting.", stderr);
        return 1;
    }
    printf ("\n allocated %zu bytes of memory for 'name'\n", sz_mem);
    memset (name, 0, sz_mem);          /* initialize memory to zero (optional) */
    strncpy (name, argv[1], sz_mem);   /* copy the null-terminator as well */
    printf (" name: '%s' (begins at address: %p)\n", name, (void *)name);
    /* realloc - make name twice as big */
    void *tmp = realloc (name, 2 * sz_mem);   /* use a temporary pointer */
    if (!tmp) {   /* check realloc succeeded */
        fprintf (stderr, "error: virtual memory exhausted, realloc 'name'\n");
        free (name);   /* the original block is still valid on failure */
        return 1;
    }
    /* cast to char * first: arithmetic on void * is not standard C */
    memset ((char *)tmp + sz_mem, 0, sz_mem * sizeof *name);   /* zero new memory */
    name = tmp;         /* assign new block to name */
    sz_mem += sz_mem;   /* update current allocation size */
    printf (" reallocated 'name' to %zu bytes\n", sz_mem);
    /* append no more than the space left, keeping room for the terminator */
    strncat (name, " reallocated", sz_mem - strlen (name) - 1);
    printf ("\n final name : '%s'\n\n", name);
    free (name);
    return 0;
}
Use/Output
$ ./bin/arraybasics "John Q. Public"
allocated 15 bytes of memory for 'name'
name: 'John Q. Public' (begins at address: 0xf17010)
reallocated 'name' to 30 bytes
final name : 'John Q. Public reallocated'
Memory Check
When you dynamically allocate memory, it is up to you to validate you are using the memory correctly and that you track and free all the memory you allocate. Use a memory error checker like valgrind to verify your memory use is correct. (There is no excuse not to; it is dead-bang-simple to do.) Just type valgrind yourprogramexe
$ valgrind ./bin/arraybasics "John Q. Public"
==19613== Memcheck, a memory error detector
==19613== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==19613== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==19613== Command: ./bin/arraybasics John\ Q.\ Public
==19613==
allocated 15 bytes of memory for 'name'
name: 'John Q. Public' (begins at address: 0x51e0040)
reallocated 'name' to 30 bytes
final name : 'John Q. Public reallocated'
==19613==
==19613== HEAP SUMMARY:
==19613== in use at exit: 0 bytes in 0 blocks
==19613== total heap usage: 2 allocs, 2 frees, 45 bytes allocated
==19613==
==19613== All heap blocks were freed -- no leaks are possible
==19613==
==19613== For counts of detected and suppressed errors, rerun with: -v
==19613== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
In the output, the following lines are of particular significance:
==19613== HEAP SUMMARY:
==19613== in use at exit: 0 bytes in 0 blocks
==19613== total heap usage: 2 allocs, 2 frees, 45 bytes allocated
This tells you that all memory allocated during your program has been properly freed. (make sure you close all open file streams, they are dynamically allocated as well).
Of equal importance is the ERROR SUMMARY:
==19613== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
There are no errors in the memory use. If you attempt to read, write or free memory from a location outside your block, or from an uninitialized location, or in a way that would leave other memory unreachable, that information will show as an error.
(the suppressed: 2 from 2 just relate to additional debug libraries not present on my system)
This ended up longer than intended, but if it helps, it was worth it. Good luck.