Question
I have asked a question related to this program before, but now, after much research and headbutting, I am stuck... again.
I am trying to write a program that takes the user's input, stores it, and then prints all the unique words and the number of times each one occurred.
For example:
Please enter something: Hello#@Hello# hs,,,he,,whywhyto[then the user hits enter]
hello 2
hs 1
he 1
whywhyto 1
The above should be the output. Of course whywhyto isn't a word, but that doesn't matter here, because I am assuming that any run of letters separated by anything that isn't a letter (spaces, 0-9, #$(@ etc.) counts as a word. I need to use 2D arrays because I can't use linked lists and don't understand them yet.
This is all I have so far:
#include <stdio.h>
#include <ctype.h>
int main()
{
    char array[64];
    int i=0, j, input;
    printf("Please enter an input:");
    input=fgetc(stdin);
    while(input != '\n')
    {
        if(isalpha(input))
        {
            array[i]=input;
            i++;
        }
        input=fgetc(stdin);
    }
    for(j=0;j<i;j++)
    {
        // printf("%c ",j,array[j]);
        printf("%c",array[j]);
    }
    printf("\n");
}
I am using isalpha to get only letters, but all this does is discard anything that isn't a letter, store the rest, and print it back. I have no clue how to store each word once, on its first occurrence, and then just increment a count every time it appears again. I can only use fgetc(), which is hard for me; I only have about 3-4 months of C experience. I know I will have to use 2-dimensional arrays and have been reading up on them, but I have not been able to work out how to implement them. Please help me out a bit.
Answer 1:
Here is code that seems to work:
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
enum { MAX_WORDS = 64, MAX_WORD_LEN = 20 };
int main(void)
{
    char words[MAX_WORDS][MAX_WORD_LEN];   /* one row per distinct word */
    int count[MAX_WORDS] = { 0 };          /* count[i] belongs to words[i] */
    int w = 0;                             /* number of distinct words stored */
    char word[MAX_WORD_LEN];               /* word currently being read */
    int c;
    int l = 0;                             /* length of the current word */
    while ((c = getchar()) != EOF)
    {
        if (isalpha(c))
        {
            if (l < MAX_WORD_LEN - 1)
                word[l++] = c;
            else
            {
                /* word is not null-terminated yet, so limit the output with a precision */
                fprintf(stderr, "Word too long: %.*s%c...\n", l, word, c);
                break;
            }
        }
        else if (l > 0)
        {
            /* a non-letter ends the current word */
            word[l] = '\0';
            printf("Found word <<%s>>\n", word);
            assert(strlen(word) < MAX_WORD_LEN);
            int found = 0;
            for (int i = 0; i < w; i++)
            {
                if (strcmp(word, words[i]) == 0)
                {
                    count[i]++;
                    found = 1;
                    break;
                }
            }
            if (!found)
            {
                if (w >= MAX_WORDS)
                {
                    fprintf(stderr, "Too many distinct words (%s)\n", word);
                    break;
                }
                strcpy(words[w], word);
                count[w++] = 1;
            }
            l = 0;
        }
    }
    for (int i = 0; i < w; i++)
        printf("%3d: %s\n", count[i], words[i]);
    return 0;
}
Sample output:
$ ./wordfreq <<< "I think, therefore I am, I think, or maybe I do not think after all, and therefore I am not."
Found word <<I>>
Found word <<think>>
Found word <<therefore>>
Found word <<I>>
Found word <<am>>
Found word <<I>>
Found word <<think>>
Found word <<or>>
Found word <<maybe>>
Found word <<I>>
Found word <<do>>
Found word <<not>>
Found word <<think>>
Found word <<after>>
Found word <<all>>
Found word <<and>>
Found word <<therefore>>
Found word <<I>>
Found word <<am>>
Found word <<not>>
5: I
3: think
2: therefore
2: am
1: or
1: maybe
1: do
2: not
1: after
1: all
1: and
$ ./wordfreq <<< "I think thereforeIamIthinkormaybeI do not think after all, and therefore I am not."
Found word <<I>>
Found word <<think>>
Word too long: thereforeIamIthinkor...
1: I
1: think
$ ./wordfreq <<< "a b c d e f g h i j k l m n o p q r s t u v w x y z
> A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
> aa ab ac ad ae af ag ah ai aj ak al am
> an ao ap aq ar as at au av aw ax ay az
> "
Found word <<a>>
Found word <<b>>
Found word <<c>>
Found word <<d>>
Found word <<e>>
Found word <<f>>
Found word <<g>>
Found word <<h>>
Found word <<i>>
Found word <<j>>
Found word <<k>>
Found word <<l>>
Found word <<m>>
Found word <<n>>
Found word <<o>>
Found word <<p>>
Found word <<q>>
Found word <<r>>
Found word <<s>>
Found word <<t>>
Found word <<u>>
Found word <<v>>
Found word <<w>>
Found word <<x>>
Found word <<y>>
Found word <<z>>
Found word <<A>>
Found word <<B>>
Found word <<C>>
Found word <<D>>
Found word <<E>>
Found word <<F>>
Found word <<G>>
Found word <<H>>
Found word <<I>>
Found word <<J>>
Found word <<K>>
Found word <<L>>
Found word <<M>>
Found word <<N>>
Found word <<O>>
Found word <<P>>
Found word <<Q>>
Found word <<R>>
Found word <<S>>
Found word <<T>>
Found word <<U>>
Found word <<V>>
Found word <<W>>
Found word <<X>>
Found word <<Y>>
Found word <<Z>>
Found word <<aa>>
Found word <<ab>>
Found word <<ac>>
Found word <<ad>>
Found word <<ae>>
Found word <<af>>
Found word <<ag>>
Found word <<ah>>
Found word <<ai>>
Found word <<aj>>
Found word <<ak>>
Found word <<al>>
Found word <<am>>
Too many distinct words (am)
1: a
1: b
1: c
1: d
1: e
1: f
1: g
1: h
1: i
1: j
1: k
1: l
1: m
1: n
1: o
1: p
1: q
1: r
1: s
1: t
1: u
1: v
1: w
1: x
1: y
1: z
1: A
1: B
1: C
1: D
1: E
1: F
1: G
1: H
1: I
1: J
1: K
1: L
1: M
1: N
1: O
1: P
1: Q
1: R
1: S
1: T
1: U
1: V
1: W
1: X
1: Y
1: Z
1: aa
1: ab
1: ac
1: ad
1: ae
1: af
1: ag
1: ah
1: ai
1: aj
1: ak
1: al
$
The tests for 'word too long' and 'too many words' help reassure me that the code is sound. Devising such tests is good practice.
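Note that the question's expected output shows hello 2 for Hello#@Hello#, while the code above compares with strcmp() and so would treat I and i as different words. It also only stores a word when a non-letter follows it, so a last word that runs straight into EOF is not counted. Here is a minimal sketch of one way to handle both, not the answer's own method. The names word_table, table_commit and table_feed are hypothetical, and I am assuming that over-long words are truncated and surplus distinct words are dropped rather than stopping the program.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum { MAX_WORDS = 64, MAX_WORD_LEN = 20 };

/* Hypothetical container: the 2D array of words, their counts,
   and the word currently being assembled. */
struct word_table
{
    char words[MAX_WORDS][MAX_WORD_LEN];
    int  count[MAX_WORDS];
    int  nwords;
    char cur[MAX_WORD_LEN];
    int  len;
};

/* Store the word in t->cur: bump its count if known, else append it. */
static void table_commit(struct word_table *t)
{
    t->cur[t->len] = '\0';
    t->len = 0;
    for (int i = 0; i < t->nwords; i++)
    {
        if (strcmp(t->cur, t->words[i]) == 0)
        {
            t->count[i]++;
            return;
        }
    }
    if (t->nwords < MAX_WORDS)   /* assumption: extra distinct words are dropped */
    {
        strcpy(t->words[t->nwords], t->cur);
        t->count[t->nwords++] = 1;
    }
}

/* Feed one value returned by fgetc(), including the final EOF. */
void table_feed(struct word_table *t, int c)
{
    if (c != EOF && isalpha(c))
    {
        /* assumption: letters past MAX_WORD_LEN - 1 are silently dropped */
        if (t->len < MAX_WORD_LEN - 1)
            t->cur[t->len++] = (char)tolower(c);
    }
    else if (t->len > 0)
    {
        table_commit(t);         /* a non-letter or EOF ends the word */
    }
}
```

To use it, start from `struct word_table t = { 0 };`, call `table_feed(&t, c)` for every `c` returned by `fgetc(stdin)`, call it once more with the terminating value (EOF or '\n') so the last word is flushed, and then print `t.count[i]` and `t.words[i]` for `i` below `t.nwords`.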
Answer 2:
I don't know whether this is homework, so I did not do everything for you; I also cleaned up your code a little. If you don't know in advance how many words the user may enter, you need a dynamic data structure such as a linked list:
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct linkedlist linkedlist;
struct linkedlist{
    char *word;
    int count;
    linkedlist *next;
};
//declared before main so that the calls below compile
void add_word(char *word, linkedlist **ll);
void destroy_list(linkedlist **ll);
int main(void)
{
    //know your bounds: letters beyond the 63rd in a single word are dropped
    char array[64];
    int i=0, input;
    linkedlist *head = NULL;
    printf("Please enter an input:");
    do
    {
        input=fgetc(stdin);
        if(input != EOF && isalpha(input))
        {
            if(i < 63) //added this so that the code does not break (keeps room for '\0')
                array[i++]=input;
        }
        else if(i > 0) //a non-letter (or the end of the input) ends the current word
        {
            array[i]='\0';
            char *word = malloc(strlen(array)+1);
            if(word == NULL)
                return 1;
            strcpy(word, array);
            add_word(word, &head);
            i=0; //need to restart i to keep reading words
        }
    } while(input != '\n' && input != EOF);
    //print out final results
    for(linkedlist *temp = head; temp != NULL; temp = temp->next){
        printf("%s %d\n", temp->word, temp->count);
    }
    destroy_list(&head);
    return 0;
}
//adds word to end of list if it does not exist
//increments word count if it exists
void add_word(char *word, linkedlist **ll){
    //implement this
}
//frees resources used by malloc (look up how to free/destroy a linked list)
void destroy_list(linkedlist **ll){
    //implement this
}
For add_word you will need something along the lines of (PSEUDO-CODE):
list = *ll
if(list == NULL): //new list
    list = malloc(sizeof(linkedlist))
    list->word = word
    list->count = 1
    list->next = NULL
    *ll = list
    return
while list->next != NULL:
    if strcmp(word, list->word) == 0: //compare contents, not pointers
        free(word)
        list->count++
        return
    list = list->next
if strcmp(word, list->word) == 0: //last word in list
    free(word)
    list->count++
else: //word did not exist, add new word to end of list
    temp = malloc(sizeof(linkedlist))
    temp->word = word
    temp->count = 1
    temp->next = NULL
    list->next = temp
This may not be the most efficient way, but you can improve upon it. I hope I did not confuse you further. Good luck!
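For reference, here is one possible way to fill in add_word and destroy_list in C. It is a sketch rather than the only correct answer: instead of special-casing the empty list and the last node as the pseudo-code does, it walks a pointer to the link being examined, so both cases fall out of the same loop. I am assuming that add_word takes ownership of the malloc'd word (it either stores it or frees it) and that a word is silently dropped if malloc fails.

```c
#include <stdlib.h>
#include <string.h>

typedef struct linkedlist linkedlist;
struct linkedlist {
    char *word;
    int count;
    linkedlist *next;
};

/* Takes ownership of word: stores it in a new node at the end of the
   list, or frees it if an equal word is already present. */
void add_word(char *word, linkedlist **ll)
{
    /* ll always points at the link we are examining: first the head
       pointer, then each node's next field. */
    while (*ll != NULL)
    {
        if (strcmp((*ll)->word, word) == 0)
        {
            (*ll)->count++;
            free(word);          /* duplicate text is no longer needed */
            return;
        }
        ll = &(*ll)->next;
    }
    linkedlist *node = malloc(sizeof *node);
    if (node == NULL)
    {
        free(word);              /* assumption: drop the word when out of memory */
        return;
    }
    node->word = word;
    node->count = 1;
    node->next = NULL;
    *ll = node;                  /* hook the node onto the head or the last next */
}

/* Frees every node and its word, then resets the caller's head pointer. */
void destroy_list(linkedlist **ll)
{
    linkedlist *node = *ll;
    while (node != NULL)
    {
        linkedlist *next = node->next;   /* save before freeing the node */
        free(node->word);
        free(node);
        node = next;
    }
    *ll = NULL;
}
```

Because new nodes always go on the end, the words print in the order they first appeared, which matches the output format in the question.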
Answer 3:
OP still has a fair amount of work ahead.
The trick is to 1) read the input, 2) identify the delimiters, 3) compare each word against the entire buffer and 4) print each word only once.
This approach is memory efficient, as it uses only the 64-char buffer suggested by OP. The search complexity is O(n*n).
#include <ctype.h>
#include <stdio.h>
#include <string.h>
// Helper function to find word occurrences.
// Prints the word only when called for its last occurrence in the buffer,
// so each distinct word is printed exactly once.
void Print_count(const char *word, const char *array, int i) {
    int count = 0;
    const char *found = NULL;
    for (int j = 0; j < i; j++) {
        if (isalpha((unsigned char) array[j])) {
            if (strcmp(&array[j], word) == 0) {
                found = &array[j];
                count++;
            }
            // skip rest of word
            do {
                j++;
            } while (isalpha((unsigned char) array[j]));
        }
    }
    if (found == word) {
        printf("%s %d\n", word, count);
    }
}
int main(void) {
    char array[64];
    int i = 0;
    int j;
    int input;
    printf("Please enter an input:");
    // get the input
    while ((input = fgetc(stdin)) != '\n' && input != EOF) {
        array[i] = input;
        if (i + 1 >= sizeof array)
            break;
        i++;
    }
    array[i] = '\0';
    // change all delimiters to \0
    for (j = 0; j < i; j++) {
        if (!isalpha((unsigned char) array[j])) {
            array[j] = '\0';
        }
    }
    for (j = 0; j < i; j++) {
        // Use the beginning of each word ...
        if (isalpha((unsigned char) array[j])) {
            Print_count(&array[j], array, i);
            // skip rest of word
            do {
                j++;
            } while (isalpha((unsigned char) array[j]));
        }
    }
    return 0;
}
Input: Hello#@Hello# hs,,,he,,whywhyto
Output:
Hello 2
hs 1
he 1
whywhyto 1
Source: https://stackoverflow.com/questions/27085900/how-to-count-unique-number-of-words-using-fgetc-then-printing-the-count-in-c