I'm really stuck on something.
I have a text file, which has 1 word followed by ~100 float numbers. The float numbers are separated by space, tab, or newline. This
As per the standard:
An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.
The likely reason that "nine" is giving you "ne" is that, when reading a double value, "nan" is one of the acceptable values. Hence, the "n" and "i" are read to establish that it's not "nan".
Similarly, the word "in" is a valid prefix of "inf", which represents infinity.
The standard also states in a footnote:
fscanf pushes back at most one input character onto the input stream.
so it's quite possible that this is why the "i" in "nine" is not being pushed back.
The bottom line is that it's basically unsafe to assume where the file pointer will end up when an fscanf operation fails for some reason.
One way to fix this is to use ftell and fseek to save the file position after each successfully read item, so that you can move back to the correct position if the thing you're attempting to read is not successful.
Let's say you have the input file:
one 1 2 3 4 5
nine 9 8 7 6 5
in 3.14159 2.71828
The following code will save and restore file positions to make it work as you wish:
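#include <stdio.h>
int main(void) {
    char buff[50]; double dbl; long pos;   /* ftell returns a long */
    FILE *fin = fopen("inputFile.txt", "r");
    if (fin == NULL) return 1;
    /* Read a word, then as many doubles as follow it. */
    while (fscanf(fin, "%49s", buff) == 1) {
        printf("Got string [%s]\n", buff);
        pos = ftell(fin);
        /* Remember the position after every successfully read item. */
        while (fscanf(fin, "%lf", &dbl) == 1) {
            printf("Got double [%f]\n", dbl);
            pos = ftell(fin);
        }
        /* The failed read may have consumed characters; go back to the last good position. */
        fseek(fin, pos, SEEK_SET);
    }
    fclose(fin);
    return 0;
}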
By commenting out the fseek, you can see behaviour similar to what you describe:
Got string [one]
Got double [1.000000]
Got double [2.000000]
Got double [3.000000]
Got double [4.000000]
Got double [5.000000]
Got string [ne]
Got double [9.000000]
Got double [8.000000]
Got double [7.000000]
Got double [6.000000]
Got double [5.000000]
Got double [3.141590]
Got double [2.718280]
I consider this solution a little messy in that it continuously has to call ftell, and occasionally fseek, to get it to work.
Another way is to just read everything as strings and decide whether each one is a number or a string with a sscanf operation after reading it in, as in the following code (with the aforementioned input file):
#include <stdio.h>
int main(void) {
    char buff[50]; double dbl;
    FILE *fin = fopen("inputFile.txt", "r");
    if (fin == NULL) return 1;
    /* Read every whitespace-delimited token as a string first. */
    while (fscanf(fin, "%49s", buff) == 1) {
        /* Then try to interpret the token as a double. */
        if (sscanf(buff, "%lf", &dbl) == 1) {
            printf("Got double [%f]\n", dbl);
        } else {
            printf("Got string [%s]\n", buff);
        }
    }
    fclose(fin);
    return 0;
}
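This works because every floating point value in the file is also a valid whitespace-delimited string (i.e., it has no embedded spaces), so reading it with %s never fails and never leaves the file position in the middle of a token.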
The output of both programs above is:
Got string [one]
Got double [1.000000]
Got double [2.000000]
Got double [3.000000]
Got double [4.000000]
Got double [5.000000]
Got string [nine]
Got double [9.000000]
Got double [8.000000]
Got double [7.000000]
Got double [6.000000]
Got double [5.000000]
Got string [in]
Got double [3.141590]
Got double [2.718280]
which is basically what was desired.
One thing you need to be aware of is that scanning something like "inf" or "nan" as a double will actually work. That is the intended behaviour of the library (and how your original code would have worked had it not had the issues). If that's not acceptable, you can do something like evaluate the string before trying to scan it as a double, to ensure it's not one of those special values.