Read consecutive tabs as empty fields with fscanf

花落未央 2021-01-24 23:06

I have a file whose fields are separated by tabs. Every line contains exactly 12 tabs; where tabs are consecutive, the field between them is empty. I want to use fscanf to read consecutive tabs as empty fields and store them in a structure.

2 answers
  • 2021-01-24 23:41

    fscanf is a non-starter. The only way to read empty fields would be to use "%c" to read the delimiters, and that would require knowing beforehand which fields are empty, which is not very useful. Otherwise, depending on the conversion specifier used, fscanf will either consume the tabs as leading whitespace or stop with a matching failure or input failure.

    Continuing from the comment, to tokenize on delimiters that may enclose empty fields you need strsep, because strtok treats consecutive delimiters as a single delimiter. (strsep is a BSD/glibc extension rather than part of ISO C.)

    While it is a bit unclear where the tabs are located in your string, a short example of tokenizing with strsep could be as follows. Note that strsep takes a pointer-to-pointer as its first argument, e.g.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main (void) {
    
        int n = 0;
        const char *delim = "\t\n";
        char *s = strdup ("usrid\tUser Id 0\t15\tstring\td\tk\ty\ty\t\t\t0\t0"),
            *toks = s,   /* tokenize with separate pointer to preserve s */
            *p;
    
        while ((p = strsep (&toks, delim)))
            printf ("token[%2d]: '%s'\n", n++ + 1, p);
    
        free (s);
    }
    

    (note: since strsep will modify the address held by the string pointer, you need to preserve a pointer to the beginning of s so it can be freed when no longer needed -- thanks JL)

    Example Use/Output

    $ ./bin/strtok_tab
    token[ 1]: 'usrid'
    token[ 2]: 'User Id 0'
    token[ 3]: '15'
    token[ 4]: 'string'
    token[ 5]: 'd'
    token[ 6]: 'k'
    token[ 7]: 'y'
    token[ 8]: 'y'
    token[ 9]: ''
    token[10]: ''
    token[11]: '0'
    token[12]: '0'
    

    Look things over and let me know if you have further questions.

  • 2021-01-24 23:49

    I wanna use fscanf to read consecutive tabs as empty fields and store them in a structure.

    Ideally, code should read a line, as with fgets() and then parse the string.

    Yet staying with fscanf(), this can be done in a loop.


    The main idea is to use "%[^\t\n]" to read one token. If the next character is a '\t' (that is, the field is empty), the conversion fails and the return value will not be 1. Test for that. A width limit is wise.

    Then read the separator and check whether it is a tab or an end-of-line, or whether end-of-file or an error occurred.

    #include <stdio.h>
    
    #define TABS_PER_LINE 12
    #define TOKENS_PER_LINE (TABS_PER_LINE + 1)
    #define TOKEN_SIZE 100
    #define TOKEN_FMT_N "99"
    
    int fread_tab_delimited_line(FILE *istream, int n, char token[n][TOKEN_SIZE]) {
      for (int i = 0; i < n; i++) {
        int token_count = fscanf(istream, "%" TOKEN_FMT_N "[^\t\n]", token[i]);
        if (token_count != 1) {
          token[i][0] = '\0';  // Empty token
        }
        char separator;
        int term_count = fscanf(istream, "%c", &separator);  // fgetc() makes more sense here
        // if end-of-file or end-of-line
        if (term_count != 1 || separator == '\n') {
          if (i == 0 && token_count != 1 && term_count != 1) {
            return 0;
          }
          return i + 1;
        }
        if (separator != '\t') {
          return -1;  // Token too long
        }
      }
      return -1;  // Too many tokens found
    }
    

    Sample driving code

    void test_tab_delimited_line(FILE *istream) {
      char token[TOKENS_PER_LINE][TOKEN_SIZE];
      long long line_count = 0;
      int token_count;
      while ((token_count = fread_tab_delimited_line(istream, TOKENS_PER_LINE, token)) > 0) {
        printf("Line %lld\n", ++line_count);
        for (int i = 0; i < token_count; i++) {
          printf("%d: <%s>\n", i, token[i]);
        }
      }
      if (token_count < 0) {
        puts("Trouble reading any tokens.");
      }
    }
    