Question
I have a file whose fields are separated by tabs. There will always be 17 tabs, but their order can vary, for example:
75104\tDallas\t85\t34.46\t45.64
75205\tHouston\t\t37.34\t87.32
93434\t\t\t1.23\t3.32
When I use strtok in the following fashion:
    while (fgets(buf, sizeof(buf), fp) != NULL) {
        tok = strtok(buf, "\t");
        while (tok != NULL) {
            printf("%s->", tok);
            tok = strtok(NULL, "\t");
        }
    }
I get all the tokens, but runs of two or more tabs (\t\t) are skipped. I need to know when a field is empty, so I cannot have strtok ignore consecutive tabs: the structure depends on all 17 tabs being counted, with a placeholder used wherever a field is empty.
I've tried dealing with the problem with an if(tok == NULL || ''), but I don't think strtok recognizes a tab after a tab. What is the best way to deal with this issue?
Answer 1:
You can't use strtok in your case.
From man strtok:
The strtok() function breaks a string into a sequence of zero or more nonempty tokens ... From the above description, it follows that a sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter, and that delimiter bytes at the start or end of the string are ignored. Put another way: the tokens returned by strtok() are always nonempty strings. Thus, for example, given the string "aaa;;bbb,", successive calls to strtok() that specify the delimiter string ";," would return the strings "aaa" and "bbb", and then a null pointer
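To see the behaviour the man page describes, here is a small sketch. The helper name count_strtok_tokens and the 256-byte buffer are my own assumptions for illustration; they are not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() yields for `s`.
 * It works on a local copy because strtok() modifies its argument. */
int count_strtok_tokens(const char *s, const char *delims)
{
    char copy[256];   /* sketch only: assumed upper bound on input length */
    char *tok;
    int count = 0;

    assert(s != NULL && delims != NULL);
    assert(strlen(s) < sizeof copy);
    strcpy(copy, s);

    /* Runs of delimiters are treated as one, so empty fields never appear. */
    for (tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims)) {
        count++;
    }
    return count;
}
```

Called on "93434\t\t\t1.23\t3.32" with "\t" as the delimiter set, this reports three tokens even though the line has five fields, which is exactly the problem in the question.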
So you will have to find an alternative: either write a function yourself that does a linear search and copies each field out with strncpy, use sscanf, or use strsep if it is available. The last would very likely be my choice, because it was intended as a replacement for strtok.
From man strsep:
The strsep() function was introduced as a replacement for strtok(3), since the latter cannot handle empty fields. However, strtok(3) conforms to C89/C99 and hence is more portable.
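If strsep is not available, the hand-written route might look roughly like the sketch below. It uses only standard C. Note that I use strcspn and memcpy here rather than strncpy. The names copy_tab_fields and FIELD_LEN, and the 64-byte field limit, are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 64   /* assumed maximum field size, including the NUL */

/* Hypothetical helper: copies the tab-separated fields of `line` into
 * `fields`, keeping empty ones. The input is not modified. Returns the
 * number of fields stored (at most max_fields). Fields longer than
 * FIELD_LEN - 1 characters are truncated. */
int copy_tab_fields(const char *line, char fields[][FIELD_LEN], int max_fields)
{
    const char *start = line;
    int n = 0;

    assert(line != NULL && fields != NULL);

    while (n < max_fields) {
        /* A field ends at the next tab, a line ending, or the end of string. */
        size_t len = strcspn(start, "\t\r\n");
        size_t copy_len = len < (size_t)FIELD_LEN - 1 ? len : (size_t)FIELD_LEN - 1;

        memcpy(fields[n], start, copy_len);
        fields[n][copy_len] = '\0';
        n++;

        if (start[len] != '\t') {
            break;            /* line ending or end of string: no more fields */
        }
        start += len + 1;     /* step past the tab; the next field may be empty */
    }
    return n;
}
```

An empty field shows up as an empty string in fields[i], so the caller can substitute whatever placeholder it needs.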
Answer 2:
Here's a solution using strsep, which was introduced specifically to address the fact that strtok skips over consecutive delimiters:
    char *cur, *nxt;
    while (fgets(buf, sizeof(buf), fp) != NULL)
    {
        nxt = buf;
        while ((cur = strsep(&nxt, "\t")) != NULL)
        {
            printf("%s->", cur);
        }
    }
NOTE: the string passed to strsep must be writable (in particular, passing a string literal does not work). strsep modifies it: on each call, the delimiter that ends the returned token is overwritten with a NUL character.
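Building on the loop above, here is a sketch that also strips the line ending fgets leaves in the buffer and substitutes a placeholder for empty fields. The function name describe_fields and the "(empty)" placeholder are assumptions chosen for this example. strsep itself is a BSD/glibc extension, not standard C.

```c
#define _DEFAULT_SOURCE   /* glibc: declares strsep() under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites one tab-separated line into `out` as
 * "field->field->...", writing "(empty)" for each empty field.
 * `line` must be writable because strsep() overwrites each tab with '\0'.
 * Returns the number of fields seen, or -1 if `out` is too small. */
int describe_fields(char *line, char *out, size_t out_size)
{
    char *nxt = line;
    char *cur;
    size_t used = 0;
    int count = 0;

    assert(line != NULL && out != NULL && out_size > 0);

    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending kept by fgets() */
    out[0] = '\0';

    while ((cur = strsep(&nxt, "\t")) != NULL) {
        int n = snprintf(out + used, out_size - used, "%s->",
                         *cur != '\0' ? cur : "(empty)");
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;                    /* output would be truncated */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Because the return value is the field count, the caller can check it against the number of fields the file format promises (17 tabs would give 18 fields) and reject malformed lines.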
Answer 3:
A good way to understand how this could be implemented is to read through a hand-written splitter such as the function below:
    #include <ctype.h>

    int splitLine(char *buf, char **argv, int max_args)
    {
        int arg;

        /* skip over initial spaces */
        while (isspace((unsigned char)*buf)) buf++;

        for (arg = 0; arg < max_args && *buf != '\0'; arg++) {
            argv[arg] = buf;

            /* skip past letters in word */
            while (*buf != '\0' && !isspace((unsigned char)*buf)) {
                buf++;
            }

            /* if not at line's end, mark word's end and continue */
            if (*buf != '\0') {
                *buf = '\0';
                buf++;
            }

            /* skip over extra spaces */
            while (isspace((unsigned char)*buf)) buf++;
        }
        return arg;
    }
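This function uses whitespace as the separator and treats a run of whitespace as a single separator, so as written it also drops empty fields. You can reimplement it to split on a single tab instead.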
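As a rough idea of what such a reimplementation could look like, here is a sketch that splits in place on single tabs and keeps empty fields. The name splitTabs is hypothetical, and this is one possible adaptation rather than the original author's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of splitLine(): splits `buf` in place on single tabs
 * and keeps empty fields. Each argv[i] points into buf. Returns the number
 * of fields stored; fields beyond max_args are dropped. */
int splitTabs(char *buf, char **argv, int max_args)
{
    int arg = 0;

    assert(buf != NULL && argv != NULL);

    buf[strcspn(buf, "\r\n")] = '\0';   /* strip a trailing line ending */

    while (arg < max_args) {
        argv[arg++] = buf;              /* field starts here, possibly empty */
        buf = strchr(buf, '\t');
        if (buf == NULL) {
            break;                      /* no more tabs: that was the last field */
        }
        *buf++ = '\0';                  /* end this field, step past the tab */
    }
    return arg;
}
```

Unlike the whitespace version, there is no loop that skips extra separators, so two adjacent tabs produce an empty string between them.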
Source: https://stackoverflow.com/questions/36319131/split-string-in-c-to-recognize-consecutive-tabs