Question
Is using fseek to backtrack over characters read by fscanf reliable?
For example, if I have just read 10 characters with fscanf and would like to go back over those 10 characters, can I just call fseek(infile, -10, SEEK_CUR)?
For most situations it works, but I seem to have problems with the character ^M (carriage return). Apparently fseek counts it as a character but fscanf doesn't, so in my previous example a 10-character block containing a ^M would require fseek(infile, -11, SEEK_CUR) instead; fseek(infile, -10, SEEK_CUR) would fall short by 1 character.
Why is this so?
Edit: I was using fopen in text mode.
Answer 1:
You're seeing the difference between a "text" and a "binary" file. When a file is opened in text mode (no 'b' in the fopen second argument), the stdio library may (indeed, must) interpret the contents of the file according to the operating system's conventions for text files. For example, in Windows, a line ends with \r\n, and this gets translated to a single \n by stdio, since that is the C convention. When writing to a text file, a single \n gets output as \r\n.
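To see the translation for yourself, a minimal sketch along these lines counts how many characters a stream delivers in a given mode. The helper name count_chars is hypothetical and not part of the original answer.

```c
#include <stdio.h>

/* Hypothetical helper: count how many characters fgetc delivers when
   `path` is opened with the given fopen mode ("r" or "rb").
   Returns -1 if the file cannot be opened. */
long count_chars(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;

    fclose(fp);
    return n;
}
```

For a file holding the four bytes a, \r, \n, b, count_chars(path, "rb") reports 4 everywhere. count_chars(path, "r") reports 3 on implementations that translate \r\n (such as Windows) and 4 on systems where text and binary mode behave identically.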
This makes it easier to write portable C programs that handle text files. Some details become complicated, however, and seeking is one of them. Because of this, the C standard only defines fseek on text files in a few cases: to the very beginning, to the very end, to the current position (that is, an offset of zero), and to a previous position that was retrieved with ftell (used with SEEK_SET). In other words, you can't portably compute a location to seek to in a text file. Or you can, but you have to take care of all the platform-specific details yourself.
Alternatively, you can use binary files and do the line-ending transformations yourself. Again, portability suffers.
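As a rough sketch of the binary-mode route: when the stream is opened with "rb", offsets are plain byte counts, so stepping back by the number of bytes just read is well defined. The helper name read_and_rewind is an assumption made for illustration. Note that any \r bytes now reach the caller untouched.

```c
#include <stdio.h>

/* Hypothetical helper for a stream opened in binary mode ("rb"):
   read up to n bytes into buf, then step back by exactly the number
   of bytes that were read. Returns that number, or -1 if the seek fails. */
long read_and_rewind(FILE *fp, char *buf, size_t n)
{
    size_t got = fread(buf, 1, n, fp);

    /* In binary mode every byte counts once, including '\r'. */
    if (fseek(fp, -(long)got, SEEK_CUR) != 0)
        return -1;

    return (long)got;
}
```

With a file containing 12345\r\n789012345, reading 10 bytes this way leaves buf[5] as '\r' and buf[6] as '\n', and the stream ends up back at offset 0.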
In your case, if you just want to go back to where you last did fscanf, the easiest would be to call ftell just before the fscanf and later fseek to that saved position with SEEK_SET.
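The ftell approach can be sketched as follows. The helper name peek_token and the 64-byte buffer are assumptions chosen for the example, not something from the question.

```c
#include <stdio.h>

/* Hypothetical helper: read one whitespace-delimited token (at most
   63 characters) into buf, then return the stream to where it was
   before the read. Returns 0 on success, -1 on failure. */
int peek_token(FILE *fp, char buf[64])
{
    long pos = ftell(fp);            /* remember the position first */
    if (pos == -1L)
        return -1;

    int matched = fscanf(fp, "%63s", buf);

    /* Seeking to a value obtained from ftell with SEEK_SET is one of
       the cases the C standard defines for text streams. */
    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;

    return matched == 1 ? 0 : -1;
}
```

Calling peek_token twice in a row on the same stream yields the same token both times, regardless of how the platform encodes line endings.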
Answer 2:
This is because fseek works with bytes, whereas fscanf, reading from a text-mode stream, sees the carriage return and line feed (two bytes in the file) as a single character.
Answer 3:
fseek has no understanding of the file's contents and just moves the file pointer 10 bytes back.
fscanf, depending on the OS, may interpret newlines differently; it may even be that fscanf will insert the ^M if you're on DOS and the ^M does not appear in the file. Check the manual that came with your C compiler.
Answer 4:
Just tried this with VS2008 and found that fscanf and fseek treated the CR and LF characters in the same way (as a single character).
So with two files:
0000000: 3132 3334 3558 3738 3930 3132 3334 3536 12345X7890123456
and
0000000: 3132 3334 350d 0a37 3839 3031 3233 3435 12345..789012345
If I read 15 characters, I get to the second '5'; if I then seek back 10 characters, the next character read is the 'X' in the first case and the CRLF in the second.
This seems like a very OS/compiler specific problem.
Answer 5:
Did you test the return value of fscanf? Post some code.
Take a look at ungetc. You may have to run a loop over it.
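A small sketch of the ungetc idea, with the caveat that the C standard guarantees only one character of pushback per stream, so pushing back a whole 10-character block in a loop may not work everywhere. The helper name peek_char is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character without consuming it,
   or EOF if the stream is at end-of-file or in error. */
int peek_char(FILE *fp)
{
    int c = fgetc(fp);
    if (c != EOF)
        ungetc(c, fp);   /* a single pushed-back character is guaranteed */
    return c;
}
```

Because the character is pushed back after translation, this works the same way in text and binary mode and needs no offset arithmetic.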
Source: https://stackoverflow.com/questions/780303/using-fseek-to-backtrack