Comparing strings in cell arrays

Question

I'm trying to find the most frequent word in a list of words. Here is my code so far:

uniWords = unique(lower(words));
for i = 1:length(words)
    for j = 1:length(uniWords)
        if (uniWords(j) == lower(words(i)))
            freq(j) = freq(j) + 1;
        end
    end
end

When I try to run the script, I get the following error:

Undefined function 'eq' for input arguments of type 'cell'.

Error in Biweekly3 (line 106)
    if (uniWords(j) == lower(words(i)))

Any help is appreciated!

Answer 1:

Indexing a cell array with () returns another cell, and == is not defined for cells. Extract the contents of the cell with {} instead:

strcmpi(uniWords{j},words{i})

Also, I suggest comparing strings with strcmp or, in this case, strcmpi, which ignores case, so you do not need to call lower.

Be careful when using == on strings: it compares character by character and returns a logical array rather than a single true/false, and the strings must be the same length or you will get an error:

>> s1='first';
>> s2='second';
>> s1==s2
Error using  == 
Matrix dimensions must agree.
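
Putting these points together, the loop from the question might be rewritten roughly as below. This is only a sketch: the sample words list, the loop variable names wi and ui, and the zero initialization of freq are assumptions added for illustration and are not part of the original post.

```matlab
words = {'The','hello','one','bye','the','one'};  % sample input (assumed)
uniWords = unique(lower(words));
freq = zeros(1, numel(uniWords));   % counts start at zero (assumed; not shown in the question)
for wi = 1:numel(words)
    for ui = 1:numel(uniWords)
        % {} extracts the character vector; strcmpi ignores case
        if strcmpi(uniWords{ui}, words{wi})
            freq(ui) = freq(ui) + 1;
        end
    end
end
[~, idx] = max(freq);           % max returns the first maximum on a tie
mostFrequent = uniWords{idx};
```

Because strcmpi returns a single logical value regardless of string length, the if condition stays well defined even when the two words differ in length.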

Answer 2:

No need for loops. unique gives you a unique identifier for each word, and you can then sum occurrences of each identifier with sparse. From that you easily find the maximum, and the maximizing word(s):

[uniWords, ~, jj] = unique(lower(words)); % jj(k) is the position of words{k} in uniWords
freq = full(sparse(ones(1,length(jj)),jj,1)); % number of occurrences of each word
m = max(freq);
result = uniWords(freq==m); % freq lines up with uniWords; returns more than one word if there's a tie

For example, with

words = {'The','hello','one','bye','the','one'}

the result is

>> result

result = 

    'one'    'the'
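
As an aside, the same counts can be obtained with accumarray instead of sparse. This is a swapped-in alternative, not the approach used in the answer above, and the variable name uniWords is an assumption introduced here to hold the first output of unique:

```matlab
words = {'The','hello','one','bye','the','one'};
[uniWords, ~, jj] = unique(lower(words));  % jj maps each word to its entry in uniWords
freq = accumarray(jj(:), 1).';             % add 1 per occurrence; row vector of counts
result = uniWords(freq == max(freq));      % every word that reaches the maximum count
```

Since freq has one entry per element of uniWords, the logical mask freq == max(freq) can index uniWords directly.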

Answer 3:

You need to index with {} rather than ():

if (uniWords{j} == lower(words{i}))

Also, I suggest not using i and j as variable names in MATLAB, since by default they denote the imaginary unit.

Update

As Chappjc points out, it is better to use strcmp (or in your case strcmpi, skipping lower), since you want to ignore case and == raises an error when the two strings differ in length.

Source: https://stackoverflow.com/questions/19870258/comparing-strings-in-cell-arrays

Tags

string

matlab

compare

cell-array