Question
int main()
{
    FILE *ft;
    char ch;
    ft=fopen("abc.txt","r+");
    if(ft==NULL)
    {
        printf("can not open target file\n");
        exit(1);
    }
    while(1)
    {
        ch=fgetc(ft);
        if(ch==EOF)
        {
            printf("done");
            break;
        }
        if(ch=='i')
        {
            fputc('a',ft);
        }
    }
    fclose(ft);
    return 0;
}
As one can see, I want to edit abc.txt so that every i in it is replaced by a.
The program runs without errors, but when I open abc.txt externally, it appears to be unedited.
What could be the reason for that?
Also, why is the character after i not replaced by a in this case, as the answers suggest?
Answer 1:
Analysis
There are multiple problems:
- fgetc() returns an int, not a char; it has to return every valid char value plus a separate value, EOF. As written, you can't reliably detect EOF. If char is an unsigned type, you'll never find EOF; if char is a signed type, you'll misidentify some valid character (often ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS) as EOF.
- If you switch between input and output on a file opened for update mode, you must use a file positioning operation (fseek(), rewind(), nominally fsetpos()) between reading and writing; and you must use a positioning operation or fflush() between writing and reading.
- It is a good idea to close what you open (now fixed in the code).
- If your writes worked, you'd overwrite the character after the i with a.
Synthesis
These changes lead to:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *ft;
    char const *name = "abc.txt";
    int ch;                     /* int, not char, so EOF can be detected */

    ft = fopen(name, "r+");
    if (ft == NULL)
    {
        fprintf(stderr, "cannot open target file %s\n", name);
        exit(1);
    }
    while ((ch = fgetc(ft)) != EOF)
    {
        if (ch == 'i')
        {
            fseek(ft, -1, SEEK_CUR);    /* step back onto the 'i' just read */
            fputc('a',ft);
            fseek(ft, 0, SEEK_CUR);     /* required before reading again */
        }
    }
    fclose(ft);
    return 0;
}
There is room for more error checking.
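As a rough sketch of what that extra checking might look like, the loop can be wrapped in a function that reports failures from fseek(), fputc(), fgetc() and fclose(). The name replace_char_in_file and its return convention are made up for this illustration. Note also that a non-zero SEEK_CUR offset on a text stream is not guaranteed by the standard; it works on POSIX systems, which is what this sketch assumes.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original answer): replaces every byte
 * equal to `from` with `to` in the file at `path`.
 * Returns the number of replacements, or -1 on any I/O error. */
long replace_char_in_file(const char *path, int from, int to)
{
    FILE *fp = fopen(path, "r+");
    if (fp == NULL)
        return -1;

    long count = 0;
    int ch;                         /* int, so EOF stays distinguishable */
    while ((ch = fgetc(fp)) != EOF)
    {
        if (ch != from)
            continue;
        /* Step back over the byte just read, overwrite it, then
         * reposition so that the next fgetc() is a legal read. */
        if (fseek(fp, -1L, SEEK_CUR) != 0 ||
            fputc(to, fp) == EOF ||
            fseek(fp, 0L, SEEK_CUR) != 0)
        {
            fclose(fp);
            return -1;
        }
        count++;
    }

    int failed = ferror(fp);        /* EOF can also mean a read error */
    if (fclose(fp) != 0)            /* fclose() flushes, so it can fail too */
        failed = 1;
    return failed ? -1 : count;
}
```

Calling replace_char_in_file("abc.txt", 'i', 'a') would then do the same job as the program above, with the caller deciding how to report a -1 result.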
Exegesis
Input followed by output requires seeks
The fseek(ft, 0, SEEK_CUR); statement is required by the C standard.
ISO/IEC 9899:2011 §7.21.5.3 The fopen function
¶7 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.
(Emphasis added.)
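The opposite direction in that paragraph (output followed by input) can be illustrated the same way. This is only a minimal sketch; the function name write_then_read_first is invented for the example, and it creates or truncates whatever file name it is given.

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to a fresh file at `path`, then reads
 * the first byte back through the same update-mode stream.
 * Returns that byte, or EOF on failure. */
int write_then_read_first(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w+");
    if (fp == NULL)
        return EOF;

    int ch = EOF;
    if (fputs(text, fp) != EOF)
    {
        /* Output -> input: fflush() or a positioning call must intervene.
         * fseek() is used here because it also returns to the start. */
        if (fseek(fp, 0L, SEEK_SET) == 0)
            ch = fgetc(fp);
    }
    fclose(fp);
    return ch;
}
```

Without the fseek() call, the fgetc() would directly follow the fputs(), which is exactly what the quoted paragraph forbids.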
fgetc() returns an int
Quotes from ISO/IEC 9899:2011, the current C standard.
§7.21 Input/output <stdio.h>
§7.21.1 Introduction
EOF
which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;
§7.21.7.1 The fgetc function
int fgetc(FILE *stream);
¶2 If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
Returns
¶3 If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.289)
289) An end-of-file and a read error can be distinguished by use of the feof and ferror functions.
So, EOF is a negative integer (conventionally it is -1, but the standard does not require that). The fgetc() function either returns EOF or the value of the character as an unsigned char (in the range 0..UCHAR_MAX, usually 0..255).
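Footnote 289 is easy to overlook, so here is a small sketch of how the two meanings of EOF might be told apart once the read loop ends. The helper name count_bytes is an assumption made for this example only.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `fp`.
 * Returns the count on a genuine end-of-file, or -1 on a read error. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int ch;                 /* holds 0..UCHAR_MAX or EOF */

    while ((ch = fgetc(fp)) != EOF)
        n++;

    /* fgetc() returned EOF: check which of the two causes it was. */
    if (ferror(fp))
        return -1;
    return n;               /* no error, so feof(fp) is nonzero here */
}
```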
§6.2.5 Types
¶3 An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.
¶5 An object declared as type signed char occupies the same amount of storage as a "plain" char object.
¶6 For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
¶15 The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.45)
45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
This justifies my assertion that plain char can be a signed or an unsigned type.
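If you want to know which choice your own implementation made, footnote 45 suggests a one-line check. The function name below is invented for illustration.

```c
#include <limits.h>

/* Returns 1 if plain char behaves as a signed type on this implementation,
 * 0 if it behaves as an unsigned type. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;    /* CHAR_MIN is either 0 or SCHAR_MIN */
}
```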
Now consider:
char c = fgetc(fp);
if (c == EOF)
…
Suppose fgetc() returns EOF, and plain char is an unsigned (8-bit) type, and EOF is -1. The assignment puts the value 0xFF into c, which is a positive integer. When the comparison is made, c is promoted to an int (and hence to the value 255), and 255 is not negative, so the comparison fails.
Conversely, suppose that plain char is a signed (8-bit) type and the character set is ISO 8859-15. If fgetc() returns ÿ, the value assigned will be the bit pattern 0b11111111, which is the same as -1, so in the comparison, c will be converted to -1 and the comparison c == EOF will return true even though a valid character was read.
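Both cases can be reproduced on a single machine by using unsigned char and signed char explicitly instead of plain char. This is a sketch under the same assumptions as the text (8-bit char, EOF equal to -1); the function names are made up, and the conversion in the second function is implementation-defined, although it yields -1 on common platforms.

```c
#include <stdio.h>

/* Case 1: EOF stored in an unsigned char no longer compares equal to EOF. */
int eof_survives_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;   /* 0xFF if EOF is -1, 8-bit char */
    return c == EOF;                        /* c promotes to a nonnegative int */
}

/* Case 2: the byte 0xFF stored in a signed char compares equal to EOF. */
int byte_ff_looks_like_eof(void)
{
    int from_fgetc = 0xFF;                  /* what fgetc() returns for byte 0xFF */
    signed char c = (signed char)from_fgetc; /* implementation-defined, usually -1 */
    return c == EOF;
}
```

Under those assumptions the first function returns 0 (end-of-file is never noticed) and the second returns 1 (a valid byte is mistaken for end-of-file).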
You can tweak the details, but the basic argument remains valid while sizeof(char) < sizeof(int). There are DSP chips where that doesn't apply; you have to rethink the rules. Even so, the basic point remains; fgetc() returns an int, not a char.
If your data is truly ASCII (7-bit data), then all characters are in the range 0..127 and you won't run into the misinterpretation of ÿ problem. However, if your char type is unsigned, you still have the 'cannot detect EOF' problem, so your program will run for a long time. If you need to consider portability, you will take this into account. These are the professional grade issues that you need to handle as a C programmer. You can kludge your way to programs that work on your system for your data relatively easily and without taking all these nuances into account. But your program won't work on other people's systems.
Answer 2:
You are not changing the 'i' in abc.txt; you are changing the character after the 'i'. Try putting fseek(ft, -1, SEEK_CUR); before your fputc('a', ft);.
After you read an 'i' character, the file position indicator of ft points at the character after that 'i', so when you write a character with fputc(), it is written at the current file position, i.e. over the character after the 'i'. See fseek(3) for further details.
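You can observe this with ftell(). The following is just a small sketch: the helper name is invented, it creates or overwrites the file it is given, and it uses binary mode so that the value returned by ftell() is a plain byte count.

```c
#include <stdio.h>

/* Hypothetical helper: creates `path` containing `text`, reads one byte,
 * and returns the file position afterwards (or -1 on failure). */
long position_after_first_read(const char *path, const char *text)
{
    FILE *fp = fopen(path, "wb+");
    if (fp == NULL)
        return -1;

    long pos = -1;
    if (fputs(text, fp) != EOF &&
        fseek(fp, 0L, SEEK_SET) == 0 &&     /* back to the first byte */
        fgetc(fp) != EOF)
        pos = ftell(fp);                    /* already past the byte just read */

    fclose(fp);
    return pos;
}
```

For a file containing "in", the position after reading the 'i' is 1, which is where the 'n' lives; that is the byte an immediate fputc() would hit.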
Answer 3:
After reading 'i' you need to "step back" to write to the correct location.
if(ch=='i')
{
    fseek(ft, -1, SEEK_CUR);
    fputc('a',ft);
}
Source: https://stackoverflow.com/questions/21958155/modify-existing-contents-of-file-in-c