I have a file whose fields are separated by tabs. There will always be 12 tabs in a line; some tabs are consecutive, which indicates an empty field. I want to use fscanf to read consecutive tabs as empty fields and store them in a structure.
fscanf is a non-starter. The only way to read empty fields would be to use "%c" to read the delimiters (and that would require you to know beforehand which fields were empty, which is not very useful). Otherwise, depending on the format specifier used, fscanf would simply consume the tabs as leading whitespace, or it would stop with a matching failure or input failure.
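To see the whitespace-skipping behavior described above, here is a minimal sketch that counts how many fields repeated "%s" conversions find in a string. It uses sscanf rather than fscanf so it needs no file, and the helper name count_scanf_fields is hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Count the fields that repeated "%s" conversions see in line.
 * Hypothetical helper, for illustration only. */
int count_scanf_fields(const char *line)
{
    char field[64];
    int used = 0, count = 0;

    /* "%s" skips any leading whitespace, tabs included, before matching,
     * so a run of tabs is swallowed as if it were a single separator. */
    while (sscanf(line, "%63s%n", field, &used) == 1) {
        count++;
        line += used;   /* advance past what this call consumed */
    }
    return count;
}
```

For the input "a\t\tb", which holds three tab-separated fields (the middle one empty), this reports only two, because the empty field between the consecutive tabs is never seen.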
Continuing from the comment, in order to tokenize based on delimiters that may separate empty fields, you will need to use strsep, as strtok will treat consecutive delimiters as one.
While it is a bit unclear from your string where the tabs are located, a short example of tokenizing with strsep could be as follows. Note that strsep takes a pointer-to-pointer as its first argument, e.g.
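#define _DEFAULT_SOURCE     /* strsep() and strdup() are not ISO C; this exposes them on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main (void) {

    int n = 0;
    const char *delim = "\t\n";
    char *s = strdup ("usrid\tUser Id 0\t15\tstring\td\tk\ty\ty\t\t\t0\t0"),
         *toks = s,     /* tokenize with separate pointer to preserve s */
         *p;

    if (!s) {           /* validate the allocation before using it */
        perror ("strdup");
        return 1;
    }

    while ((p = strsep (&toks, delim)))     /* empty fields come back as "" */
        printf ("token[%2d]: '%s'\n", n++ + 1, p);

    free (s);
    return 0;
}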
(note: since strsep will modify the address held by the string pointer, you need to preserve a pointer to the beginning of s so it can be freed when no longer needed -- thanks JL)
Example Use/Output
$ ./bin/strtok_tab
token[ 1]: 'usrid'
token[ 2]: 'User Id 0'
token[ 3]: '15'
token[ 4]: 'string'
token[ 5]: 'd'
token[ 6]: 'k'
token[ 7]: 'y'
token[ 8]: 'y'
token[ 9]: ''
token[10]: ''
token[11]: '0'
token[12]: '0'
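For contrast, here is a rough sketch of what strtok does with the same input. The helper name count_strtok_tokens is hypothetical; it copies the string (strtok modifies its argument) and simply counts the tokens returned.

```c
#include <stdlib.h>
#include <string.h>

/* Count the tokens strtok() finds in a private copy of line.
 * Hypothetical helper for illustration; returns -1 if allocation fails. */
int count_strtok_tokens(const char *line, const char *delim)
{
    size_t len = strlen(line) + 1;
    char *copy = malloc(len);
    int count = 0;

    if (!copy)
        return -1;
    memcpy(copy, line, len);    /* strtok() writes into the buffer it scans */

    /* strtok() skips runs of delimiters, so empty fields never appear */
    for (char *p = strtok(copy, delim); p; p = strtok(NULL, delim))
        count++;

    free(copy);
    return count;
}
```

Called with the string from the example and the delimiters "\t\n", this finds 10 tokens rather than 12: the two empty fields between the three consecutive tabs are dropped.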
Look things over and let me know if you have further questions.
I wanna use fscanf to read consecutive tabs as empty fields and store them in a structure.
Ideally, code should read a line, as with fgets(), and then parse the string.
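As a minimal sketch of that read-then-parse approach in standard C only, the line could be split in place with strcspn(), which keeps empty fields. The function name split_tab_fields and its parameters are assumptions made for this illustration, not part of any library.

```c
#include <string.h>

/* Split line in place on tabs, keeping empty fields.
 * Stores up to max_fields pointers in fields[] and returns the number of
 * fields found, or -1 if the line holds more than max_fields.
 * Hypothetical helper for illustration. */
int split_tab_fields(char *line, char *fields[], int max_fields)
{
    int count = 0;

    line[strcspn(line, "\n")] = '\0';       /* drop the newline fgets() keeps */

    for (;;) {
        if (count == max_fields)
            return -1;                      /* no room for another field */
        fields[count++] = line;

        size_t len = strcspn(line, "\t");   /* length of this field, may be 0 */
        if (line[len] == '\0')
            break;                          /* last field on the line */
        line[len] = '\0';                   /* terminate the field at the tab */
        line += len + 1;                    /* step past the tab */
    }
    return count;
}
```

A caller might read each line with fgets(buf, sizeof buf, fp) and pass buf straight to this function. A line longer than the buffer would arrive in pieces, which this sketch does not handle, and an empty line is reported as one empty field.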
Yet staying with fscanf(), this can be done in a loop.
The main idea is to use "%[^\t\n]" to read one token. If the next character is a '\t', nothing matches the scanset and the return value will not be 1. Test for that. A width limit is wise.
Then read the separator and check whether it is a tab or an end-of-line, or whether end-of-file or an error occurred.
#include <stdio.h>

#define TABS_PER_LINE 12
#define TOKENS_PER_LINE (TABS_PER_LINE + 1)
#define TOKEN_SIZE 100
#define TOKEN_FMT_N "99"    // Width limit: TOKEN_SIZE - 1

// Read one tab-delimited line into token[].
// Returns the number of tokens read, 0 at end-of-file, -1 on error.
int fread_tab_delimited_line(FILE *istream, int n, char token[n][TOKEN_SIZE]) {
    for (int i = 0; i < n; i++) {
        int token_count = fscanf(istream, "%" TOKEN_FMT_N "[^\t\n]", token[i]);
        if (token_count != 1) {
            token[i][0] = '\0'; // Empty token
        }
        char separator;
        int term_count = fscanf(istream, "%c", &separator); // fgetc() makes more sense here
        // if end-of-file or end-of-line
        if (term_count != 1 || separator == '\n') {
            if (i == 0 && token_count != 1 && term_count != 1) {
                return 0;
            }
            return i + 1;
        }
        if (separator != '\t') {
            return -1; // Token too long
        }
    }
    return -1; // Too many tokens found
}
Sample driving code
void test_tab_delimited_line(FILE *istream) {
    char token[TOKENS_PER_LINE][TOKEN_SIZE];
    long long line_count = 0;
    int token_count;
    while ((token_count = fread_tab_delimited_line(istream, TOKENS_PER_LINE, token)) > 0) {
        printf("Line %lld\n", ++line_count);
        for (int i = 0; i < token_count; i++) {
            printf("%d: <%s>\n", i, token[i]);
        }
    }
    if (token_count < 0) {
        puts("Trouble reading any tokens.");
    }
}