The edges printed do not match the nodes

心不动则不痛 submitted on 2020-01-15 11:28:25

Question


I have a program that reads a file with two columns of numbers, sorts them, and creates three tables: one with only the nodes (each listed once), one with all the edges, and one with the number of edges for every node. The problem is that when I try to print the edges, it prints the wrong ones or says it cannot find them. Using gdb I found out that the first two arrays are fine, but the third stores a bunch of random numbers (or zeros) towards the end. Any help would be appreciated. The file looks like this (start/end node for each edge):

7856    8192
7754    7005
7862    1982
7862    3293
7862    4037
7862    5210
7862    5605
7862    7860

The code looks like this:

#include<stdio.h>
#include<stdlib.h>
#include<string.h>



int mapcmp(const void *a,const void *b){
  return ( *(int*)a - *(int*)b );
}

int mapdoublesize(int** map,int nodes){
    int* new_array=malloc(nodes*2*sizeof(int));
    if(new_array==NULL){
        printf("Error allocating memory\n");
        abort();
    }
    nodes*=2;
    for(int i=0;i<nodes;i++){
        new_array[i]=(*map)[i];
    }

    free(*map);
    *map=new_array;
    return nodes;
}

typedef struct {
    int start;
    int end;   
} path;

int cmp(const void *a,const void *b){
    int l=((path*)a)->start;
    int r=((path*)b)->start;

    if(l>r)
        return 1;
    if(l<r)
        return -1;
    if(l==r)
        return 0;
  }

int doublesize(path** array,int n){
    path* new_array=malloc(n*2*sizeof(path));
    if(new_array==NULL){
        printf("Error allocating memory\n");
        abort();
    }

    for(int i=0;i<n;i++){
        new_array[i]=(*array)[i];
    }
    free(*array);
    *array=new_array;
    n*=2;
    return n;

}


int main()
{
    int maxsize=10;
    int test;
    path* array=malloc(maxsize*sizeof(path));
    if(array==NULL) {
        printf("Error allocating memory\n");
        abort();
    }


    FILE* fd=fopen("Wiki-Vote.txt","r");
    if(fd==NULL) {
        printf("Error opening file\n");
        abort();
    }
    char buff[200];
    int counter=0;

    char c;
  while(fgets(buff,200,fd)) {

        c=buff[0];
        if(c=='#') {
            continue;
        }
    sscanf(buff,"%d%d",&array[counter].start,&array[counter].end);
        counter++;
        if(counter==maxsize){
           maxsize=doublesize(&array,maxsize); 
    }

    }
  int i;
  maxsize=counter;
    counter=0;
    qsort(&array[0],maxsize,sizeof(path),cmp);




  counter=0;
  int nodes=10;
  int* map=malloc(nodes*sizeof(int));
  if(map==NULL){
    printf("Error allocating memory\n");
    abort();
  }

for(i=0;i<maxsize;i++){
  if(map[counter-1]==array[i].start)
    continue;
        map[counter]=array[i].start;
        counter++;
        if(counter==nodes){
          nodes=mapdoublesize(&map,nodes);
        }
}
int j;
for(i=0;i<maxsize;i++){
  for(j=0;j<counter;j++){
    if(map[j]==array[i].end)
      break;
  }
  if(j!=counter)
    continue;
  map[counter]=array[i].end;
  counter++;
  if(counter==nodes)
    nodes=mapdoublesize(&map,nodes);
}

nodes=counter;
qsort(&map[0],nodes,sizeof(int),mapcmp);


  int* arraynodes=malloc(nodes*sizeof(int));
  int* arrayedges=malloc(maxsize*sizeof(int));
  if(arraynodes==NULL||arrayedges==NULL){
    printf("Error allocating memory\n");
    abort();
  }
  counter=1;

  arraynodes[0]=0;
  for(i=0;i<maxsize;i++){
    arrayedges[i]=array[i].end;
    if(array[i].start!=array[i+1].start){
      arraynodes[counter]=i;
      counter++;
    }
  }


int x;
  printf("give number to search: "); 
  scanf("%d",&x);
  for(i=0;i<nodes;i++){
    if(x==map[i]){
      printf("found \n");
      break;
    }

  }
  if(i==nodes){
    printf("not found \n");
    abort();
  }

  for(j=arraynodes[i];j<arraynodes[i+1];j++){
    printf("%d\n",arrayedges[j]);
  }

  free(arraynodes);
  free(arrayedges);
  free(map);
    fclose(fd);
    free(array);
        return 0;
}


Answer 1:


Core answer:

As I understand your intention, you want arraynodes to hold for each node index the offset in the edge list where the edges for that node start.

You iterate over the edge list and, every time the starting point changes, store the current offset in arraynodes. This is flawed because not every node is the starting point of an edge. For example, if your edge list has an edge 5 -> 7 followed by an edge 6 -> 7, you will register the change of the starting point from 5 to 6, but you will store the current offset in the next free slot at the beginning of arraynodes rather than in the slot that belongs to that node's index.

To fix this, keep an offset into the edge list, initially zero. Iterate over the nodes and, for each node, store the current offset in arraynodes. Then increment the offset for as long as the starting point of the edge at the current offset equals the current node. This way arraynodes tells you, for each node index, the index in the edge list at which the edges starting at that node are stored.

  /**
   * Assumption: Edges are sorted by their starting point.
   */

  int edge_count = maxsize;
  int edge_offset = 0;

  /**
   * For each node:
   *
   * - Store current edge_offset in arraynodes
   * - Increment edge_offset as long as the start point
   *   of the edge at that offset matches the current node.
   */
  for (int i = 0; i < nodes; i++) {
    int current_node = map[i];
    arraynodes[i] = edge_offset;

    while (edge_offset < edge_count && array[edge_offset].start == current_node) {
      edge_offset++;
    }
  }

  /**
   * Copy end-points of edges to arrayedges.
   *
   * You don't really need this, you could also directly
   * access the end-points in your output loop ...
   */
  for (int i = 0; i < edge_count; i++) {
    arrayedges[i] = array[i].end;
  }
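The 5 -> 7, 6 -> 7 example from above can be used to check this approach. The following is a minimal sketch, not a drop-in replacement: the function name build_offsets and its parameter names are made up for illustration, and it assumes the node list is sorted without duplicates and the edge list is sorted by starting point. It also writes one extra trailing entry holding the total edge count, which is an addition to the code above and means the offsets array needs room for node_count + 1 entries.

```c
#include <assert.h>

typedef struct {
    int start;
    int end;
} path;

/*
 * Hypothetical helper: fill offsets[0..node_count] so that the edges
 * leaving nodes[i] are edges[offsets[i]] .. edges[offsets[i + 1] - 1].
 *
 * Assumptions of this sketch:
 *  - nodes is sorted ascending, has no duplicates and contains every
 *    value that appears as a starting point
 *  - edges is sorted ascending by .start
 *  - offsets has room for node_count + 1 entries
 */
void build_offsets(const int *nodes, int node_count,
                   const path *edges, int edge_count,
                   int *offsets)
{
    int edge_offset = 0;

    for (int i = 0; i < node_count; i++) {
        offsets[i] = edge_offset;
        /* Skip over all edges that start at the current node. */
        while (edge_offset < edge_count &&
               edges[edge_offset].start == nodes[i]) {
            edge_offset++;
        }
    }

    /* If the assumptions hold, every edge has been consumed by now. */
    assert(edge_offset == edge_count);

    /* Extra trailing entry so that offsets[i + 1] is valid for the last node. */
    offsets[node_count] = edge_count;
}
```

For the nodes 5, 6, 7 and the edges 5 -> 7, 6 -> 7, this yields the offsets 0, 1, 2, 2: node 7 has no outgoing edges, so its range is empty, but it still gets its own slot.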

Memory safety issues:

There are several memory safety issues in your code:

  1. Buffer underflow: In the first pass of the loop, counter is zero, so map[counter-1] goes out of bounds.
  counter = 0;
  int nodes = 10;
  int *map = malloc(nodes * sizeof(int));
  if (map == NULL) {
    printf("Error allocating memory\n");
    abort();
  }

  for (i = 0; i < maxsize; i++) {
    if (map[counter - 1] == array[i].start)
      continue;
  2. Buffer overflow: When you initialize the map, you want to double its size when it's full. However, in mapdoublesize when you copy the data from the old map to the new map, you iterate over the whole new map, so the second half of this loop reads past the bounds of the old map:
  nodes *= 2;
  for (int i = 0; i < nodes; i++) {
    new_array[i] = (*map)[i];
  }
  3. Buffer overflow: In the last iteration of this loop, the access to array[i+1] is out of bounds:
  for (i = 0; i < maxsize; i++) {
    arrayedges[i] = array[i].end;
    if (array[i].start != array[i + 1].start) {
  4. Buffer overflow: In your output loop, if i is the last node, your access to arraynodes[i+1] goes out of bounds:
for (j = arraynodes[i]; j < arraynodes[i + 1]; j++) {
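One possible way to address the first two issues is sketched below. This is only an illustration under stated assumptions: the names map_double_size and map_append_unique are made up, realloc is swapped in for the manual copy loop (it copies only the old contents), and the capacity check is done before the write rather than after it as in the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: double the capacity of *map and return the new
 * capacity. realloc preserves only the old contents, so nothing is read
 * past the bounds of the old allocation.
 */
int map_double_size(int **map, int capacity)
{
    assert(capacity > 0);

    int *grown = realloc(*map, (size_t)capacity * 2 * sizeof(int));
    if (grown == NULL) {
        fprintf(stderr, "Error allocating memory\n");
        abort();
    }
    *map = grown;
    return capacity * 2;
}

/*
 * Hypothetical helper: append value unless it equals the last stored
 * entry. The count > 0 check avoids reading map[-1] while the map is
 * still empty. Returns the new number of stored entries.
 */
int map_append_unique(int **map, int *capacity, int count, int value)
{
    if (count > 0 && (*map)[count - 1] == value) {
        return count;
    }
    if (count == *capacity) {
        *capacity = map_double_size(map, *capacity);
    }
    (*map)[count] = value;
    return count + 1;
}
```

Like the original loop, map_append_unique only compares against the previous entry, so it removes duplicates only when the values arrive in sorted order.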

I do not guarantee that I found all memory safety issues; there may well be more. I would advise you to improve the structure and documentation of your program: break it down into smaller functions that each do one step, and document the assumptions and preconditions of each step (e.g. what are the bounds of the array you are accessing?). Give variables names that clearly describe their purpose, and do not reuse variables. This should make it easier for you to spot these kinds of errors.

I would also advise you to use tools to check for memory safety issues. GCC and Clang both have a feature called AddressSanitizer (ASan) that automatically inserts debug code into your binary to detect and report memory safety issues when you run your program. You can enable it by compiling with -fsanitize=address. Another tool with a similar scope is Valgrind. These tools cannot find all errors, of course, since they only perform dynamic analysis of the code that is actually executed: a bug in a branch that the current run does not reach will not be detected. So you still cannot avoid taking a careful look at your program.
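As an illustration of small functions with documented preconditions, the lookup and output steps could be split roughly as follows. This is a sketch only: the function names are hypothetical, and out_degree assumes an offsets table with one extra trailing entry that holds the total edge count, which the original arraynodes does not have.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the index of node in
 * nodes[0..node_count - 1], or -1 if it is not present.
 * Only indices below node_count are ever read.
 */
int find_node_index(const int *nodes, int node_count, int node)
{
    for (int i = 0; i < node_count; i++) {
        if (nodes[i] == node) {
            return i;
        }
    }
    return -1;
}

/*
 * Hypothetical helper: number of edges leaving the node at node_index.
 *
 * Precondition (assumption of this sketch): offsets has node_count + 1
 * entries and the last one is the total edge count, so
 * offsets[node_index + 1] is valid even for the last node.
 */
int out_degree(const int *offsets, int node_count, int node_index)
{
    assert(node_index >= 0 && node_index < node_count);
    return offsets[node_index + 1] - offsets[node_index];
}
```

With the bounds spelled out in the comments and checked by assert, an out-of-range index fails at the call site in a debug build instead of silently reading unrelated memory. Building with -fsanitize=address would additionally report accesses that such checks miss.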



Source: https://stackoverflow.com/questions/59721849/the-edges-printed-does-not-match-the-nodes
