Finding unique elements in a string array in C

轻奢々 2021-01-05 16:39

C's handling of strings bothers me. I have pseudocode like this in mind:

char *data[20]; 

char *tmp; int i,j;

for(i=0;i<20;i++) {
  tmp = da         


        
5 Answers
  • 2021-01-05 16:44
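    char *data[20];          /* assumed to be filled with valid strings */
    int i, j, n, unique[20];

    n = 0;                   /* number of distinct strings found so far */
    for (i = 0; i < 20; ++i)
    {
        /* compare data[i] with the first occurrence of each distinct string */
        for (j = 0; j < n; ++j)
        {
            if (!strcmp(data[i], data[unique[j]]))
               break;        /* data[i] duplicates an earlier string */
        }

        /* the inner loop ran to completion, so data[i] is new */
        if (j == n)
            unique[n++] = i;
    }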
    

    The indexes of the first occurrence of each unique string should be in unique[0..n-1] if I did that right.
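
    For anyone who wants to try this, here is a minimal sketch of the same loop wrapped in a function. The name unique_indices and its parameters are made up for illustration and are not part of the question; it assumes every entry in data is a valid NUL-terminated string and that unique has room for count entries.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores in unique[] the index of the first occurrence
   of each distinct string in data[0..count-1] and returns how many it found. */
size_t unique_indices(char *data[], size_t count, size_t unique[])
{
    size_t n = 0;                       /* distinct strings found so far */

    for (size_t i = 0; i < count; ++i) {
        size_t j;

        /* look for data[i] among the distinct strings already recorded */
        for (j = 0; j < n; ++j) {
            if (strcmp(data[i], data[unique[j]]) == 0)
                break;                  /* data[i] repeats an earlier string */
        }

        if (j == n)
            unique[n++] = i;            /* no match: first occurrence */
    }
    return n;
}
```

    With the five strings abc, def, abc, abc, lop it returns 3 and leaves 0, 1, 4 in unique.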

  • 2021-01-05 16:44

    Could your test be if (strcmp(this, that))? That condition is true when the two strings differ, because strcmp returns 0 for equal strings. !strcmp is probably what you want there.
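
    Since the return value of strcmp is easy to read backwards, one option is to hide it behind a small predicate. This is only a sketch; same_string is a made-up name, not something from the question.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the two strings have the same
   contents and 0 otherwise. strcmp() returns 0 for equal strings and a
   nonzero value (negative or positive) when they differ. */
int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;   /* same truth value as !strcmp(a, b) */
}
```

    Writing if (same_string(tmp, data[j])) then reads the way the test is meant.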

  • 2021-01-05 16:56

    Why are you starting the second loop from 1?

    You should start it from i+1, i.e.

    for(j=i+1;j<20;j++) 
    

    For example, if the list is

    abc
    def
    abc
    abc
    lop
    

    then when i==4, tmp="lop".

    The second loop then runs from 1 to 19, so at some point j is also 4, and data[4], which is "lop", is compared with tmp and matches. So although "lop" is unique, it is flagged as repeated.

    Hope it was helpful.
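
    A minimal sketch of the inner loop starting at i+1, written as a function. The name repeated_later and its parameters are hypothetical, and it assumes every entry is a valid string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if data[i] appears again at a higher
   index, 0 otherwise. */
int repeated_later(char *data[], size_t count, size_t i)
{
    /* starting at i + 1 means data[i] is never compared with itself */
    for (size_t j = i + 1; j < count; ++j) {
        if (strcmp(data[i], data[j]) == 0)
            return 1;
    }
    return 0;
}
```

    For the list above, repeated_later returns 0 for index 4 ("lop") and 1 for index 0 ("abc"). Note that with this approach it is the last occurrence of a repeated string, not the first, that is reported as not repeated.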

  • 2021-01-05 17:04

    You could use qsort to force the duplicates next to each other. Once sorted, you only need to compare adjacent entries to find duplicates. The result is O(N log N) rather than (I think) O(N^2).

    Here is the 15 minute lunchtime version with no error checking:

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      typedef struct {
         int origpos;
         char *value;
      } SORT;

      int qcmp(const void *x, const void *y) {
         const SORT *a = x;
         const SORT *b = y;
         int res = strcmp( a->value, b->value );
         if ( res != 0 )
            return res;
         // they are equal - use original position as tie breaker
         return a->origpos - b->origpos;
      }

      int main( int argc, char* argv[] )
      {
         SORT *sorted;
         char **orig;
         int i;
         int num = argc - 1;

         orig = malloc( sizeof( char* ) * ( num ));
         sorted = malloc( sizeof( SORT ) * ( num ));

         for ( i = 0; i < num; i++ ) {
            orig[i] = argv[i + 1];
            sorted[i].value = argv[i + 1];
            sorted[i].origpos = i;
            }

         qsort( sorted, num, sizeof( SORT ), qcmp );

         // remove the dups (the tie breaker keeps equal strings in original order,
         // so the first occurrence is the one that survives)
         for ( i = 0; i < num - 1; i++ ) {
            if ( !strcmp( sorted[i].value, sorted[i+1].value ))
               // clear the duplicate entry however you see fit
               orig[sorted[i+1].origpos] = NULL;  // or free it if dynamic mem
            }

         // print them without dups in original order
         for ( i = 0; i < num; i++ )
            if ( orig[i] )
               printf( "%s ", orig[i] );

         free( orig );
         free( sorted );
         return 0;
      }

    
  • 2021-01-05 17:08

    Think a bit more about your problem -- what you really want to do is look at the PREVIOUS strings to see if you've already seen it. So, for each string n, compare it to strings 0 through n-1.

    print element 0 (it is unique)
    for i = 1 to n-1 (n is the number of elements)
      unique = 1
      for j = 0 to i-1 (compare this element to the ones preceding it)
        if element[i] == element[j]
           unique = 0
           break from loop
      if unique, print element i
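
    The pseudocode above could be written in C roughly as follows. This is a sketch, not the asker's code: print_unique and its parameters are invented names, and the return value was added only so the result can be checked.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints each string the first time it appears in
   data[0..count-1], one per line, and returns how many were printed. */
size_t print_unique(char *data[], size_t count)
{
    size_t printed = 0;

    for (size_t i = 0; i < count; ++i) {
        int unique = 1;

        /* compare this element with the ones preceding it */
        for (size_t j = 0; j < i; ++j) {
            if (strcmp(data[i], data[j]) == 0) {
                unique = 0;
                break;
            }
        }

        if (unique) {
            puts(data[i]);
            ++printed;
        }
    }
    return printed;
}
```

    Starting the outer loop at 0 makes the separate "print element 0" step unnecessary, because the inner loop simply does not run for the first element.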
    