Searching for strings that are NULL terminated within a file where they are not NULL terminated

Submitted by 大兔子大兔子 on 2019-12-25 08:58:33

Question


I am writing a program that opens two files for reading: the first file contains 20 names, which I store in an array of the form Names[0] = John\0. The second file is a large text file that contains many occurrences of each of the 20 names.

I need my program to scan the entirety of the second file and increment a variable Count each time it finds one of the names, so that when the program finishes, Count holds the total number of appearances of all the names in the text.

Here is my loop, which searches for and counts the name occurrences:

char LineOfText[85];
char *TempName;    

while(fgets(LineOfText, sizeof(LineOfText), fpn)){
    for(a = 0; a<NumOfNames; a++){
        TempName = strstr(LineOfText, Names[a]);
        if(TempName != NULL){
            Count++;
        }
    }
}

No matter what I do, this loop doesn't work as I would expect it to, but I have discovered what is wrong (I think!). My problem is that each name in the array is NULL terminated, but when a name appears in the text file it is not NULL terminated, unless it occurs as the last word of a line. Therefore, this while loop is only counting the number of times any of the names appear at the end of a line, rather than the number of appearances of any of the names anywhere in the text file. How can I adjust this loop to combat this problem?

Thank you for any advice in advance.


Answer 1:


The issue here is probably your use of fgets, which does not trim the newline from the line it reads.

If you are creating your names array by reading lines with fgets, then all the names will be terminated with a newline character. The lines in the file being read with fgets will also be terminated with a newline character, so the names will only match at the end of the lines.
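If the names are indeed read with fgets, one common fix is to strip the retained newline from each name before searching. The following is a minimal sketch; strip_newline is a hypothetical helper name, not something from the question, and it also removes a carriage return in case the names file has Windows line endings.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n'.
   strcspn returns the index of the first character found in the
   reject set, or the string length if none is present, so a string
   without a newline is left unchanged. */
static void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

Calling strip_newline(Names[a]) right after each name is read should leave the array holding John rather than John\n. A blank line in the names file would become an empty string, which strstr matches at every position, so it may be worth skipping empty names.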

strstr does not compare the NUL byte which terminates the pattern string, for obvious reasons. If it did, it would only match suffix strings, which would make it a very different function.
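A small sketch can make the difference visible. contains_name is a hypothetical wrapper around strstr, and the sample strings used with it are invented for illustration.

```c
#include <string.h>

/* Hypothetical wrapper: returns 1 if name occurs anywhere in line,
   0 otherwise. The terminating NUL of name is not part of the match. */
static int contains_name(const char *line, const char *name)
{
    return strstr(line, name) != NULL;
}
```

With the line "Then John left.\n", the name "John" is found in the middle of the line, while "John\n" is not. With the line "It was John\n", both are found. This matches the symptom in the question: a name that still carries its newline can only match at the end of a line.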

Also, you will only find a maximum of one instance of each name in each line. If you think that a name might appear more than once in the same line, you should replace:

 TempName = strstr(LineOfText, Names[a]);
 if(TempName != NULL){
    Count++;
 }

with something like:

 for (TempName = LineOfText;
      (TempName = strstr(TempName, Names[a])) != NULL;
      ++Count, ++TempName) {
 }
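The same idea can be packaged as a standalone function, which makes it easier to test in isolation. This is a sketch under a few assumptions: count_occurrences is a hypothetical name, an empty name is treated as having zero matches, and overlapping matches are counted because the search resumes one character past each hit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count every position in line where name occurs.
   Overlapping matches are counted, and a match inside a longer word
   (for example "John" inside "Johnny") is counted too. */
static size_t count_occurrences(const char *line, const char *name)
{
    size_t count = 0;
    const char *p = line;

    /* strstr with an empty needle matches at every position,
       so treat an empty name as "no matches". */
    if (name[0] == '\0')
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        ++count;
        ++p;  /* resume just past the start of this match */
    }
    return count;
}
```

Inside the loop from the question, Count += count_occurrences(LineOfText, Names[a]); would replace the single strstr test. If matches inside longer words should not count, the characters on either side of each hit would need an extra check.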

For reference, here is the definition of fgets from the C standard (emphasis added):

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

This is different from gets, which does not retain the new-line character.
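One consequence of that definition is that a line longer than the buffer (85 bytes in the question) is returned in several pieces, so a name that straddles two pieces could be missed. Whether this matters depends on the input file. A minimal sketch for detecting a partial read follows; ends_with_newline is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the buffer filled by fgets ends
   with '\n', meaning a complete line was read. A result of 0 means
   either the line was longer than the buffer or the file ended
   without a final newline. */
static int ends_with_newline(const char *buf)
{
    size_t len = strlen(buf);
    return len > 0 && buf[len - 1] == '\n';
}
```

If this returns 0 before end-of-file, one option is to use a larger buffer; another is to keep the tail of the previous piece and search across the boundary.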




Answer 2:


I think the NULL termination of the names array is not an issue (See strstr function reference). The strstr function is not going to compare the terminator. You do have the possibility of missing additional names on each line. See my adjustment below for an example of how you could count multiple names on each line.

char LineOfText[85];
char *TempName;    

while(fgets(LineOfText, sizeof(LineOfText), fpn)){
    for(a = 0; a<NumOfNames; a++){
        TempName = strstr(LineOfText, Names[a]);

        /* Iterate through line for multiple occurrences of each name */
        while(TempName != NULL){
            Count++;

            /* Get next occurrence of name on line. fgets is going to
               leave a newline at the end of the LineOfText string so
               unless some of your names contain a newline, it shouldn't
               move past the end of the buffer */
            TempName = strstr(TempName + 1, Names[a]);
        }
    }
}


Source: https://stackoverflow.com/questions/30054679/searching-for-strings-that-are-null-terminated-within-a-file-where-they-are-not
