What is the easiest way to count the newlines in an ASCII file?

一生所求 2020-11-30 10:01

What is the fastest way to count the lines of an ASCII file?

5 answers
  • 2020-11-30 10:34

    Normally you read files in C using fgets. You can also use scanf("%[^\n]"), but quite a few people reading the code are likely to find that confusing and foreign.

    Edit: on the other hand, if you really do just want to count lines, a slightly modified version of the scanf approach can work quite nicely:

    while (EOF != (scanf("%*[^\n]"), scanf("%*c"))) 
        ++lines;
    

    The advantage of this is that with the '*' in each conversion, scanf reads and matches the input, but does nothing with the result. That means we don't have to waste memory on a large buffer to hold the content of a line that we don't care about (and still take a chance of getting a line that's even larger than that, so our count ends up wrong unless we go to even more work to figure out whether the input we read ended with a newline).

    Unfortunately, we do have to break up the scanf into two pieces like this. scanf stops scanning when a conversion fails, and if the input contains a blank line (two consecutive newlines) we expect the first conversion to fail. Even if that fails, however, we want the second conversion to happen, to read the next newline and move on to the next line. Therefore, we attempt the first conversion to "eat" the content of the line, and then do the %c conversion to read the newline (the part we really care about). We continue doing both until the second call to scanf returns EOF (which will normally be at the end of the file, though it can also happen in case of something like a read error).
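
    The same two-call loop can be wrapped in a function that reads from any stream. This is only a sketch under a couple of assumptions: the name count_newlines_scanf is made up for illustration, and fscanf stands in for scanf so that the stream can be passed as a parameter.

```c
#include <stdio.h>

/* Hypothetical helper (name not from the answer): counts the newline
   characters on an already-open stream using the two-step scan. */
unsigned long count_newlines_scanf(FILE *fp)
{
    unsigned long lines = 0;

    for (;;) {
        /* Step 1: discard everything up to, but not including, the next
           newline. This matches nothing on an empty line, which is fine. */
        (void)fscanf(fp, "%*[^\n]");

        /* Step 2: discard one character, which is the newline if there is
           one. fscanf returns EOF once no input is left. */
        if (fscanf(fp, "%*c") == EOF)
            break;

        ++lines;
    }
    return lines;
}
```

    Calling count_newlines_scanf(stdin) should behave like the loop above. Note that a final line with no trailing newline is not counted, because the second call hits end of file before it can read a newline.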

    Edit2: Of course, there is another possibility that's (at least arguably) simpler and easier to understand:

    int ch;
    
    while (EOF != (ch=getchar()))
        if (ch=='\n')
            ++lines;
    

    The only part of this that some people find counterintuitive is that ch must be defined as an int, not a char, for the code to work correctly: getchar returns an int so that EOF can be distinguished from every valid character value.

  • 2020-11-30 10:34

    Maybe I'm missing something, but why not simply:

    #include <stdio.h>
    int main(void) {
      int n = 0;
      int c;
      while ((c = getchar()) != EOF) {
        if (c == '\n')
          ++n;
      }
      printf("%d\n", n);
    }
    

    If you also want to count a final partial line (i.e. one not terminated by a newline, [^\n]EOF):

    #include <stdio.h>
    int main(void) {
      int n = 0;
      int pc = EOF;
      int c;
      while ((c = getchar()) != EOF) {
        if (c == '\n')
          ++n;
        pc = c;
      }
      if (pc != EOF && pc != '\n')
        ++n;
      printf("%d\n", n);
    }
    
  • 2020-11-30 10:47

    What about this?

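    #include <stdio.h>
    #include <string.h>
    
    #define BUFFER_SIZE 4096
    
    int main(int argc, char** argv)
    {
        unsigned long count = 0;
        size_t bytes;
        FILE* f;
        char buffer[BUFFER_SIZE];
        const char* ptr;
        const char* end;
    
        if (argc != 2 || !(f = fopen(argv[1], "r")))
        {
            return 1;
        }
    
        /* Read the file block by block; fread returns 0 at end of file or on error. */
        while ((bytes = fread(buffer, sizeof(char), BUFFER_SIZE, f)) > 0)
        {
            end = buffer + bytes;
    
            /* Count every newline in this block. memchr takes an explicit
               length, so it is not stopped early by NUL bytes. */
            for (ptr = buffer; (ptr = memchr(ptr, '\n', (size_t)(end - ptr))) != NULL; ++ptr)
            {
                ++count;
            }
        }
    
        if (ferror(f))
        {
            fclose(f);
            return 1;
        }
    
        fclose(f);
    
        printf("%lu\n", count);
    
        return 0;
    }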
    
  • 2020-11-30 10:48

    Here's a solution based on fgetc() which will work for lines of any length and doesn't require you to allocate a buffer.

    #include <stdio.h>
    
    int main(void)
    {
        FILE *fp = stdin;                   /* or use fopen to open a file */
        int c;                              /* N.B. int (not char) for the EOF */
        unsigned long newline_count = 0;
    
        /* count the newline characters */
        while ( (c=fgetc(fp)) != EOF ) {
            if ( c == '\n' )
                newline_count++;
        }
    
        printf("%lu newline characters\n", newline_count);
        return 0;
    }
    
  • 2020-11-30 10:53

    Come on, why compare every character one at a time? It is very slow: on a 10 MB file it takes about 3 s.
    The solution below is faster.

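    /* Note: fgetline is not a standard C function. This assumes a
       user-supplied helper that reads one line from fp and returns
       nonzero until end of file. */
    unsigned long count_lines_of_file(const char *file_path) {
        FILE *fp = fopen(file_path, "r");
        unsigned long line_count = 0;
    
        if (fp == NULL) {
            return 0;   /* could not open the file */
        }
        while (fgetline(fp))
            line_count++;
    
        fclose(fp);
        return line_count;
    }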
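
    Since fgetline is not part of standard C, the function above does not compile unless that helper is supplied separately. One way to get a similar line-at-a-time loop with only the standard library is fgets. The sketch below is an assumption about what was intended, not the answer's own code: the name count_lines_fgets, the 4096-byte chunk size, and the choice to count an unterminated last line are all illustrative. It also assumes a text file with no NUL bytes, since it relies on strlen. Whether it is actually faster than a getchar loop depends on the C library and has not been measured here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fgetline-based loop: counts the lines in
   the named file, including a last line with no trailing newline.
   Returns 0 if the file cannot be opened, like the original. */
unsigned long count_lines_fgets(const char *file_path)
{
    char chunk[4096];              /* chunk size is an arbitrary assumption */
    unsigned long line_count = 0;
    int in_partial_line = 0;       /* nonzero while inside an unterminated line */
    FILE *fp = fopen(file_path, "r");

    if (fp == NULL)
        return 0;

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t len = strlen(chunk);

        if (len > 0 && chunk[len - 1] == '\n') {
            ++line_count;          /* this chunk completes a line */
            in_partial_line = 0;
        } else {
            /* Either a line longer than the chunk that continues in the
               next read, or the file ends without a newline. */
            in_partial_line = 1;
        }
    }
    if (in_partial_line)
        ++line_count;              /* count the unterminated last line */

    fclose(fp);
    return line_count;
}
```

    Checking the last character of each chunk is what keeps the count right when a line is longer than the buffer: such a line is read in several pieces, but only the piece that ends in '\n' is counted.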
    