Question
I have a text file and I only want to keep the lines longer than 12 characters. These lines are then inserted into a variable called graph, of type graph_node.
txt file:
1,"execCode(workStation,root)","OR",0
2,"RULE 4 (Trojan horse installation)","AND",0
3,"accessFile(workStation,write,'/usr/local/share')","OR",0
4,"RULE 16 (NFS semantics)","AND",0
5,"accessFile(fileServer,write,'/export')","OR",0
6,"RULE 10 (execCode implies file access)","AND",0
7,"canAccessFile(fileServer,root,write,'/export')","LEAF",1
6,7,-1
graph node type:
#ifndef Graph_Structure
#define Graph_Structure
struct Graph_Node{
    char id[50];
    char node_name[50];
    struct Graph_Node* next;
    struct Graph_Node* edges;
};
typedef struct Graph_Node graph_node;
#endif
this is the method to insert data into graph variable:
void insert_node(graph_node** node, graph_node* data){
    printf("\nINSERTING\n");
    graph_node* temp = (graph_node*)malloc(sizeof(graph_node));
    for(int i = 0; i < strlen(data->id); i++)
        temp->id[i] = data->id[i];
    for(int i = 0; i < strlen(data->node_name) - 1; i++)
        temp->node_name[i] = data->node_name[i];
    temp -> next = *node;
    *node = temp;
}
this is the method to get the lines in the txt file bigger than 12 characters:
void generate_nodes(graph_node** graph, char* file_name){
    graph_node* data = (graph_node*)malloc(sizeof(graph_node));
    FILE* f_Data_Pointer = fopen(file_name, "r");
    FILE* f_Aux_Pointer = fopen(file_name, "r");
    char c = 0; char line[256];
    int counter = 0;
    int q = 0; //quotation marks
    bool jump_line = false;
    while(!feof(f_Data_Pointer)){
        c = 0; memset(line, 0, sizeof(line));
        while(c != '\n' && !feof(f_Aux_Pointer)){ // check line
            c = fgetc(f_Aux_Pointer);
            line[counter] = c;
            counter++;
        }
        if(strlen(line) > 12){ //lines with no edges
            /*line[counter-3] != '-' && line[counter-2] != '1'*/
            int size = strlen(line); printf("\nline size: %d\n", size);
            counter = 0; c = 0;
            while(c != ','){ //id
                c = fgetc(f_Data_Pointer);
                if(c != ','){
                    data->id[counter] = c;
                    counter++;
                }
                printf("%c", c);
            }
            counter = 0; c = 0;
            while(1){ //node_name
                c = fgetc(f_Data_Pointer);
                if(c != '"'){
                    data->node_name[counter] = c;
                    counter++;
                }
                else{
                    q++;
                }
                if(q > 1 && c == ',')
                    break;
                printf("%c", c);
            }
            counter = 0; c = 0;
            while(c != '\n'){
                c = fgetc(f_Data_Pointer);
                printf("%c", c);
            }
            insert_node(graph, data);
            memset(data->id, 0, sizeof(data->id));
            memset(data->node_name, 0, sizeof(data->node_name));
        }
        else{ //lines with edges
            while(c != '\n' && !feof(f_Data_Pointer)){
                c = fgetc(f_Data_Pointer);
            }
        }
    }
    fclose(f_Data_Pointer);
    fclose(f_Aux_Pointer);
}
I am getting errors inside the insert method, at the strlen calls in the for loops: valgrind says that data->id and data->node_name are not initialized, but I don't understand why. I used malloc on data:
graph_node* data = (graph_node*)malloc(sizeof(graph_node));
error:
Conditional jump or move depends on uninitialised value(s)
==3612==    at 0x4C30B18: strcpy (vg_replace_strmem.c:510)
==3612==    by 0x4008B2: insert_node (mulval.c:44)
==3612==    by 0x400C03: generate_nodes (mulval.c:159)
==3612==    by 0x400CE8: main (mulval.c:187)
Answer 1:
The biggest problem in your code is that you keep ignoring the '\0'-terminating byte and then pass the result to functions like strlen, which expect a valid string (one that is 0-terminated).
For example, in insert_node:
for(int i = 0; i < strlen(data->id); i++)
    temp->id[i] = data->id[i];
Here you are copying all characters except the '\0'-terminating byte, so temp->id will store a sequence of characters, but it is not a string. The call strlen(data->id) itself has undefined behaviour, because data->id is most likely not 0-terminated if you filled it without 0-terminating the string.
Here you should use strcpy if you know that the source string is at most 49 characters long, or strncpy to be completely sure:
strncpy(temp->id, data->id, sizeof temp->id);
temp->id[sizeof(temp->id) - 1] = 0;
The same applies to node_name. You are also not checking whether malloc returned NULL.
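Putting these points together, a minimal sketch of a safer insert_node could look like the following. It keeps the original struct with char id[50]. The helper copy_field and the int return value are assumptions added for illustration; the original function returns void.

```c
#include <stdlib.h>
#include <string.h>

struct Graph_Node {
    char id[50];
    char node_name[50];
    struct Graph_Node *next;
    struct Graph_Node *edges;
};
typedef struct Graph_Node graph_node;

/* Hypothetical helper (not in the original code): copies src into a
 * fixed-size buffer and always 0-terminates the result. */
static void copy_field(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Returns 1 on success, 0 if an argument is NULL or malloc fails.
 * The return value is an assumption; the original returns void. */
int insert_node(graph_node **node, const graph_node *data)
{
    if (node == NULL || data == NULL)
        return 0;

    graph_node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return 0;

    copy_field(temp->id, sizeof temp->id, data->id);
    copy_field(temp->node_name, sizeof temp->node_name, data->node_name);
    temp->edges = NULL;
    temp->next = *node;   /* push onto the front of the list */
    *node = temp;
    return 1;
}
```

This still requires data->id and data->node_name to be valid 0-terminated strings, so the caller has to terminate them as well.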
The feof lines are also wrong, see Why is "while (!feof(file))" always wrong?
You should rewrite this
while(c != '\n' && !feof(f_Aux_Pointer)){ // check line
    c = fgetc(f_Aux_Pointer);
    line[counter] = c;
    counter++;
}
like this:
int c; // <-- not char c
while((c = fgetc(f_Aux_Pointer)) != '\n' && c != EOF)
    line[counter++] = c;
line[counter] = 0; // terminating the string!
fgetc returns an int, not a char, so c should be an int; otherwise the comparison with EOF will go wrong. Here you also forgot to set the 0-terminating byte, so the result was a sequence of characters but not a string. The next line, if(strlen(line) ..., can overflow because of this, as strlen will keep looking for the 0-terminating byte beyond the limits of the buffer. It is not clear to me why you even check for EOF, as you would keep reading when the outer while loop resumes. If f_Aux_Pointer got to the end, shouldn't the function just return? I'm not really sure what you are doing there. Also, there is no strategy for when the newline is not among the first 255 characters read (line holds 256 bytes). I think you should rethink your strategy here.
It also would have been easier to do:
if(fgets(line, sizeof line, f_Aux_Pointer) == NULL)
{
    // error handling
    return; // perhaps this?
}
line[strcspn(line, "\n")] = 0; // removing the newline
if(strlen(line) > 12)
    ...
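The case where a line does not fit in the buffer can also be handled on top of fgets. The sketch below is one possible approach, not the only one: read_line is a hypothetical helper name, and the choice to discard the rest of an overlong line (rather than grow the buffer) is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and strips the newline.
 * Returns  1 if a complete line was read,
 *          0 at end of file or on a read error,
 *         -1 if the line did not fit; the remainder is discarded. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';          /* complete line, newline removed */
        return 1;
    }

    /* No newline in buf: either the file ended, or the line is longer
     * than the buffer. Skip to the end of the line and remember
     * whether anything actually had to be thrown away. */
    int c, discarded = 0;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        discarded = 1;

    return discarded ? -1 : 1;
}
```

A caller can then loop with while((rc = read_line(fp, line, sizeof line)) != 0) and skip the iterations where rc is -1.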
Here
while(c != ','){ //id
    c = fgetc(f_Data_Pointer);
    if(c != ','){
        data->id[counter] = c;
        counter++;
    }
    printf("%c", c);
}
you have the same error as above: you never set the '\0'-terminating byte. At the end of the while loop you should have data->id[counter] = 0;. The same applies to the next while loop.
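As a rough illustration of that fix, the character-by-character loop can be wrapped in a small function that also checks the buffer size and EOF. The name read_field and its return convention are assumptions made for this sketch; they are not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: copies characters from fp into buf until delim,
 * a newline, or EOF is seen. Characters that do not fit are dropped,
 * and buf is always 0-terminated. Returns the character that stopped
 * the read (delim, '\n' or EOF). */
int read_field(FILE *fp, char *buf, size_t size, int delim)
{
    size_t n = 0;
    int c;                         /* int, so EOF can be detected */

    if (size == 0)
        return EOF;                /* nowhere to store a terminator */

    while ((c = fgetc(fp)) != EOF && c != delim && c != '\n') {
        if (n + 1 < size)          /* leave room for the '\0' */
            buf[n++] = (char)c;
    }
    buf[n] = '\0';                 /* terminating the string */
    return c;
}
```

With this, the id loop would become something like read_field(f_Data_Pointer, data->id, sizeof data->id, ','), and a return value other than ',' signals a malformed line.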
In generate_nodes you don't need to allocate memory for the temporary graph_node object, as insert_node will create a copy anyway. You can do:
graph_node data;
...
data.id[counter] = c;
...
data.node_name[counter] = c;
...
insert_node(graph, &data);
data.id[0] = 0;
data.node_name[0] = 0;
and you will have one malloc less to worry about.
EDIT
Because your IDs are always numerical, I'd change the struct to this:
struct Graph_Node{
    int id;
    char node_name[50];
    struct Graph_Node* next;
    struct Graph_Node* edges;
};
which would make life easier. If your conditions hold, that is, the only lines you need are the ones longer than 12 characters, and the lines you are interested in always have the same format as the ones posted in your question (no spaces around the commas, second column always in quotes), then you can parse the file like this:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// graph_node comes from your header (with int id, see above)
int generate_nodes(graph_node **graph, const char *file_name)
{
    if(file_name == NULL || graph == NULL)
        return 0; // return error
    // no need to allocate memory for it
    // if insert_node is going to make a
    // copy anyway
    struct Graph_Node data = { .next = NULL, .edges = NULL };
    FILE *fp = fopen(file_name, "r");
    if(fp == NULL)
    {
        fprintf(stderr, "Error opening file %s: %s\n", file_name,
                strerror(errno));
        return 0;
    }
    // no line will be longer than 1024
    // based on your conditions
    char line[1024];
    size_t linenmr = 0;
    while(fgets(line, sizeof line, fp))
    {
        linenmr++;
        // getting rid of the newline
        line[strcspn(line, "\n")] = 0;
        if(strlen(line) <= 12)
            continue; // resume reading, skip the line
        char *sep;
        long int id = strtol(line, &sep, 0);
        // assuming that there are no white spaces
        // before and after the commas
        if(*sep != ',')
        {
            fprintf(stderr, "Warning, line %zu is malformed, '<id>,' expected\n", linenmr);
            continue;
        }
        data.id = (int)id; // id is an int in the modified struct
        // format is: "....",
        if(sep[1] != '"')
        {
            fprintf(stderr, "Warning, line %zu is malformed, \"<string>\", expected\n", linenmr);
            continue;
        }
        // looking for ",
        char *endname = strstr(sep + 2, "\",");
        if(endname == NULL)
        {
            fprintf(stderr, "Warning, line %zu is malformed, \"<string>\", expected\n", linenmr);
            continue;
        }
        // ending string at ",
        // makes the strncpy easier
        *endname = 0;
        strncpy(data.node_name, sep + 2, sizeof data.node_name);
        data.node_name[sizeof(data.node_name) - 1] = 0;
        insert_node(graph, &data);
    }
    fclose(fp);
    return 1;
}
Now the interesting bits are these:
char *sep;
long int id = strtol(line, &sep, 0);
// assuming that there are no white spaces
// before and after the commas
if(*sep != ',')
{
    fprintf(stderr, "Warning, line %zu is malformed, '<id>,' expected\n", linenmr);
    continue;
}
strtol is a function that converts a number in a string to an actual integer. This function is better than atoi, because it allows you to convert numbers in different bases, from binary to hexadecimal (any base from 2 to 36). The second advantage is that it tells you where it stopped reading. This is great for error detection, and I use this behaviour to detect whether the line has the correct format. If the line has the correct format, a comma (,) must appear right after the number. I check for this, and if it's not true, I print an error message, skip the line and continue reading the file.
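The same check can be made a bit stricter. The following is a small sketch, assuming a hypothetical helper named parse_id: besides the comma test, it rejects input with no digits at all and values that do not fit in an int.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses "<id>," at the start of line.
 * On success stores the id, sets *rest to the comma that follows the
 * number, and returns 1. Returns 0 if the format is wrong. */
int parse_id(const char *line, int *id, const char **rest)
{
    char *end;

    errno = 0;
    long value = strtol(line, &end, 10);

    if (end == line)                      /* no digits were read */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                         /* does not fit in an int */
    if (*end != ',')                      /* number must be followed by ',' */
        return 0;

    *id = (int)value;
    *rest = end;
    return 1;
}
```

The sketch passes base 10 instead of 0. With base 0, strtol picks the base from the prefix, so an id written as 010 would be read as octal 8; which behaviour is wanted depends on the file format.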
The next if checks whether the next character is a quote ("), because based on your file, the second argument is enclosed in quotes. Sadly, the separator of the arguments is a comma, which is also used as a normal character inside the string. This makes things a little more complex (you cannot use strtok here). Once again, I assume that the whole argument ends in ",. Note that this does not take escaped quotes into consideration. A line like this:
3,"somefunc(a,b,\"hello\",d)","OR",0
would be parsed incorrectly. When ", is found, I set the quote to '\0' so that it is easier to use strncpy; otherwise I would have to calculate the length of the string, check that its length doesn't surpass the destination size, etc. If you need to keep parsing, endname + 1 should point to the next comma; otherwise the format is wrong.
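If the remaining columns are needed, parsing could continue from endname + 1 along these lines. This is only a sketch: parse_tail is a hypothetical helper, and the names type and flag for the third and fourth columns are guesses, since the question does not say what those columns mean. It also shows the length-based copy that is the alternative to overwriting the quote.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: after_name must point at the comma that follows
 * the closing quote of the name (endname + 1 in the code above). The
 * expected text is  ,"TYPE",<number>  with the newline already removed.
 * Returns 1 on success, 0 if the format is wrong or TYPE does not fit. */
int parse_tail(const char *after_name, char *type, size_t type_size,
               long *flag)
{
    if (type_size == 0 || after_name[0] != ',' || after_name[1] != '"')
        return 0;

    const char *start = after_name + 2;        /* first char of TYPE */
    const char *close = strstr(start, "\",");  /* closing quote + comma */
    if (close == NULL)
        return 0;

    size_t len = (size_t)(close - start);
    if (len >= type_size)                      /* no room for the '\0' */
        return 0;
    memcpy(type, start, len);
    type[len] = '\0';

    char *end;
    *flag = strtol(close + 2, &end, 10);
    if (end == close + 2 || *end != '\0')      /* need digits, then end */
        return 0;
    return 1;
}
```

For the line 1,"execCode(workStation,root)","OR",0 this would be called as parse_tail(endname + 1, type, sizeof type, &flag) once the name has been copied.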
strncpy(data.node_name, sep + 2, sizeof data.node_name);
data.node_name[sizeof(data.node_name) - 1] = 0;
Here I copy the string. The source is sep + 2 because sep points to the first comma, so you have to skip it. The next character is the quote, so you have to skip it as well, hence sep + 2. Because the length of the name is unknown, it's better to use strncpy; this ensures that you don't write more bytes than the destination can hold.
The last step is to insert the node into the graph. Note that I didn't allocate memory, because I know that insert_node is going to make a new copy anyway. Because data is only used temporarily, you don't need to allocate memory for it; just pass a pointer to data (by passing &data) to insert_node.
I've tested this function with a dummy graph and a dummy insert_node that only prints the node:
void insert_node(graph_node** node, graph_node* data)
{
    printf("ID: %d, node_name: %s\n", data->id, data->node_name);
}
and this is the output:
ID: 1, node_name: execCode(workStation,root)
ID: 2, node_name: RULE 4 (Trojan horse installation)
ID: 3, node_name: accessFile(workStation,write,'/usr/local/share')
ID: 4, node_name: RULE 16 (NFS semantics)
ID: 5, node_name: accessFile(fileServer,write,'/export')
ID: 6, node_name: RULE 10 (execCode implies file access)
ID: 7, node_name: canAccessFile(fileServer,root,write,'/export')
Now if you use this code and you keep getting valgrind errors, then that means that you still have errors in other parts of your code that you haven't shown us.
Source: https://stackoverflow.com/questions/48775429/why-am-i-getting-segmentation-fault-on-the-code-below