How can a file contain null bytes?

一整个雨季 2021-02-03 23:17

How is it possible that files can contain null bytes in operating systems written in a language with null-terminated strings (namely, C)?

For example, if I run this she

6 Answers
  •  情话喂你
    2021-02-03 23:57

    Null-terminated strings are a C construct used to determine the end of a sequence of characters intended to be used as a string. String manipulation functions such as strcmp, strcpy, strchr, and others use this construct to perform their duties.

    But you can still read and write binary data that contains null bytes within your program as well as to and from files. You just can't treat them as strings.

    Here's an example of how this works:

    #include <stdio.h>
    #include <stdlib.h>
    
    int main()
    {
        FILE *fp = fopen("out1","wb");  // binary mode: no newline translation on any platform
        if (fp == NULL) {
            perror("fopen failed");
            exit(1);
        }
    
        int a1[] = { 0x12345678, 0x33220011, 0x0, 0x445566 };
        char a2[] =  { 0x22, 0x33, 0x0, 0x66 };
        char a3[] = "Hello\x0World";
    
        // this writes the whole array
        fwrite(a1, sizeof(a1[0]), 4, fp);
        // so does this
        fwrite(a2, sizeof(a2[0]), 4, fp);
        // this does not write the whole array -- only "Hello" is written
        fprintf(fp, "%s\n", a3);
        // but this does
        fwrite(a3, sizeof(a3[0]), 12, fp);
        fclose(fp);
        return 0;
    }
    

    Contents of out1:

    [dbush@db-centos tmp]$ xxd out1
    0000000: 7856 3412 1100 2233 0000 0000 6655 4400  xV4..."3....fUD.
    0000010: 2233 0066 4865 6c6c 6f0a 4865 6c6c 6f00  "3.fHello.Hello.
    0000020: 576f 726c 6400                           World.
    

    For the first array, because we use the fwrite function and tell it to write 4 elements the size of an int, all the values in the array appear in the file. You can see from the output that all values are written, the values are 32-bit, and each value is written in little-endian byte order. We can also see that the second and fourth elements of the array each contain one null byte, while the third value being 0 has 4 null bytes, and all appear in the file.

    We also use fwrite on the second array, which contains elements of type char, and we again see that all array elements appear in the file. In particular, the third value in the array is 0, which consists of a single null byte that also appears in the file.

    The third array is first written with the fprintf function using a %s format specifier which expects a string. It writes the first 5 bytes of this array to the file before encountering the null byte, after which it stops reading the array. It then prints a newline character (0x0a) as per the format.

    The third array is written to the file again, this time using fwrite. The string constant "Hello\x0World" contains 12 bytes: 5 for "Hello", one for the explicit null byte, 5 for "World", and one for the null byte that implicitly ends the string constant. Since fwrite is given the full size of the array (12), it writes all of those bytes. Indeed, looking at the file contents, we see each of those bytes.

    As a side note, in each of the fwrite calls, I've hardcoded the element count for the third parameter instead of using a more dynamic expression such as sizeof(a1)/sizeof(a1[0]) to make it clearer exactly how many elements are being written in each case. For the two char arrays, the element count is also the byte count.
