How to check for repeated characters within a string in C

Question

I'm trying to create a program that checks for repeated characters within the command-line argument's string. The string is supposed to contain exactly 26 characters, and all characters have to be alphabetical. There also cannot be any repeated characters within the string: each alphabetical character must appear only once. I figured out the first two parts of the program, but I can't figure out how to check for repeated characters. I could really use some help, as well as explanations.

#include <stdio.h>
#include <cs50.h>
#include <string.h>
#include <ctype.h>

int main (int argc, string argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./substitution key\n");
        return 1;
    }
    else
    {
        int len = strlen(argv[1]);
        if (len != 26)
        {
            printf("Key must contain 26 characters.\n");
            return 1;
        }
        else
        {
            for (int i = 0; i < len; i++)
            {
                if (!isalpha(argv[1][i]))
                {
                    printf("Usage: ./substitution key\n");
                    return 1;
                }
            }
        }
    }
}

Answer 1:

Here are the basics of a solution:

When starting, initialize a table of flags: char Seen[UCHAR_MAX+1] = {0};. In this array, Seen[c] will be true (non-zero) if and only if character c has been seen already. (To get UCHAR_MAX, #include <limits.h>.)

When processing each character, copy it to an unsigned char: unsigned char c = argv[1][i];. Then convert it to uppercase: c = toupper(c);. Then test whether it has been seen already:

if (Seen[c])
    Report error...

If it is new, remember that it has been seen: Seen[c] = 1;.

That is all that is necessary.

Notes:

If it is known that A to Z are contiguous in the character set, as they are in ASCII, then char Seen[UCHAR_MAX+1] can be reduced to char Seen['Z'-'A'+1] and indexed using Seen[c-'A']. (Actually, a weaker condition suffices: A is the lowest uppercase character in value and Z is the greatest.)
Do not be tempted to use unsigned char c = toupper(argv[1][i]), because toupper is defined for unsigned char values, and a char value may be out of bounds (negative).
In an esoteric C implementation where a char is as wide as an int, none of which are known to exist, UCHAR_MAX+1 would evaluate to zero. However, the compiler should warn if this happens.
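
Put together, the steps above might look like the following minimal sketch. The function name has_repeated_char is hypothetical (it does not come from the answer), and the check is case-insensitive because of the toupper call.

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (name chosen for illustration): returns true if any
   character occurs more than once in s, ignoring case. */
bool has_repeated_char(const char *s)
{
    /* seen[c] is non-zero once character c has been encountered */
    char seen[UCHAR_MAX + 1] = {0};

    for (size_t i = 0; s[i] != '\0'; i++)
    {
        /* Copy to unsigned char first so toupper never gets a negative value */
        unsigned char c = (unsigned char) s[i];
        c = (unsigned char) toupper(c);

        if (seen[c])
        {
            return true;   /* c was already seen: this is a repeat */
        }
        seen[c] = 1;       /* remember that c has been seen */
    }
    return false;
}
```

In the question's program, this could be called as has_repeated_char(argv[1]) after the length and isalpha checks, printing an error and returning 1 when it reports true.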

Answer 2:

From the question, the expected input is 26 characters, each appearing exactly once.

Following Andreas Wenzel's idea, I added a counter for each letter, so any missing or duplicated letter is spotted. Here's my solution:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    const int OFFSET = 'A';

    if (argc != 2)
    {
        return 1;
    }
    if (strlen(argv[1]) != 26)
    {
        printf("Key must contain 26 characters.\n");
        return 1;
    }

    const char *key = argv[1];
    int count[26] = {0}; // one counter per letter, all starting at zero

    // Check for any invalid character and count each letter
    for (int i = 0, n = strlen(key); i < n; i++)
    {
        unsigned char c = key[i]; // isalpha/toupper expect unsigned char values
        if (!isalpha(c))
        {
            printf("Key contains invalid characters.\n");
            return 1;
        }
        count[toupper(c) - OFFSET]++;
    }
    // Check that every letter appears exactly once
    for (int i = 0; i < 26; i++)
    {
        if (count[i] != 1)
        {
            printf("Key shouldn't contain duplicate characters\n");
            return 1;
        }
    }
    return 0;
}

Answer 3:

You can consider using strchr(). Here is an example from https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_72/rtref/strchr.htm

#include <stdio.h>
#include <string.h>

#define SIZE 40

int main(void) {
  char buffer1[SIZE] = "computer program";
  char * ptr;
  int    ch = 'p';

  ptr = strchr(buffer1, ch);
  printf("The first occurrence of %c in '%s' is '%s'\n",
          ch, buffer1, ptr );

}

/*****************  Output should be similar to:  *****************
The first occurrence of p in 'computer program' is 'puter program'
*/

Answer 4:

There are many ways to do it, but the one I like is to first sort the string and then go through it, deleting the duplicates or copying the unique characters to another string/array. For example, take

s = "bedezdbe"

After you sort it:

s = "bbddeeez"

And to check that two neighboring characters differ:

if(s[i-1]!=s[i])
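
As a rough sketch of the sort-then-compare idea, the code below sorts a copy of the string with the standard qsort function and then looks for equal neighbors. The names compare_chars and has_duplicate_sorted are hypothetical, the comparison is case-sensitive, and working on a copy is a design choice made here so that the caller's string keeps its original order.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders bytes by their unsigned value. */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *) a;
    unsigned char y = *(const unsigned char *) b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: returns 1 if s contains a repeated character,
   0 if it does not, and -1 if memory for the copy could not be allocated. */
int has_duplicate_sorted(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL)
    {
        return -1;
    }
    memcpy(copy, s, len + 1);

    /* After sorting, equal characters sit next to each other */
    qsort(copy, len, sizeof copy[0], compare_chars);

    int found = 0;
    for (size_t i = 1; i < len; i++)
    {
        if (copy[i - 1] == copy[i])
        {
            found = 1;
            break;
        }
    }

    free(copy);
    return found;
}
```

With the example string from this answer, has_duplicate_sorted("bedezdbe") reports a duplicate because the sorted copy starts with two adjacent 'b' characters.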

Answer 5:

I'm assuming you have not yet taken Data Structures and Algorithms. If you had, I would suggest a binary insertion sort: when you find a repeat, you just printf your error message and return 1, like you have done for the rest of the errors.

That said, that might be over your head right now. In that case, just go with a brute-force method without regard for optimization.

Start with the first character and loop through the rest of the string to see if it matches.
If it matches, print an error message.
If there is no match, repeat for the next character.

The code below checks for repeating characters using the brute-force method described above. You'll need to adapt it to your code, of course.

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

bool itrepeats(char c, char *restofstring, int sublen){
    for(int j=0; j<sublen; j++){
        if(c==*restofstring) return true;
        restofstring++;
    }
    return false;
}

int main (int argc, char *argv[])
{
    //make sure a string was actually passed
    if(argc != 2) return 1;

    int len = strlen(argv[1]);
    char *substr = argv[1];
    
    //loop through all characters
    for(int i=0;i<len;i++){
        //get a character
        char c = argv[1][i];
        //start of substring after c
        substr++;
        //length of that substring
        int substrlen = len-(i+1);
        //check for c in rest of string
        if(itrepeats(c, substr, substrlen)){
            printf("%c repeats, not allowed!\n",c);
            return 1;
        }
    }
    //reached only if no character repeated
    printf("all good!\n");
    return 0;
}

Answer 6:

You can first sort the characters in the string so that any repeated letters end up right next to each other. After sorting, scan through the sorted string with a variable holding the character you scanned last. If the next character you scan is the same as the last one, you have a repeated character.

Answer 7:

You can use strchr() to detect if a given character is duplicated in the rest of the string:

#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: ./substitution key\n");
        return 1;
    } else {
        char *key = argv[1];
        int len = strlen(key);
        if (len != 26) {
            printf("Key must contain 26 characters.\n");
            return 1;
        } else {
            for (int i = 0; i < len; i++) {
                unsigned char c = key[i];
                if (!isalpha(c)) {
                    printf("The key must contain only letters.\n");
                    return 1;
                }
                if (strchr(key + i + 1, c)) {
                    printf("The key must not contain duplicate letters.\n");
                    return 1;
                }
            }
        }
    }
    return 0;
}
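
Note that strchr() compares exact character values, so a key containing both 'a' and 'A' would pass a plain strchr() check. If the two cases should count as the same letter, one possible variation is to search the rest of the string for both forms. This is only a sketch, and the function name key_has_duplicate_letter is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if any character of key appears again
   later in the string, treating upper and lower case as the same letter. */
bool key_has_duplicate_letter(const char *key)
{
    for (size_t i = 0; key[i] != '\0'; i++)
    {
        /* Use unsigned char so toupper/tolower never get a negative value */
        unsigned char c = (unsigned char) key[i];
        const char *rest = key + i + 1;

        /* Look for either case of c in the remainder of the key */
        if (strchr(rest, toupper(c)) != NULL || strchr(rest, tolower(c)) != NULL)
        {
            return true;
        }
    }
    return false;
}
```

Inside the loop of the program above, the same two strchr() calls could replace the single call if case-insensitive matching is wanted.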

Answer 8:

I'm assuming this is from cs50; I'm on this problem right now. I used a simple boolean function with two loops: the outer loop picks the reference character, and the nested loop goes through all the characters after it and returns false if the characters at positions i and k are equal.

bool no_repeat(string arg_letters)
{
   for (int i = 0, j = strlen(arg_letters); i < j; i++)
   {
      for (int k = i+1; k < j; k++)
      {
         if (arg_letters[i] == arg_letters[k])
         {
            return false;
         }

      }
   }
   return true;
}

Source: https://stackoverflow.com/questions/62723673/how-to-check-for-repeated-characters-within-a-string-in-c

Tags

cs50