Comparing strings in cell arrays

Posted by 安稳与你 on 2019-12-24 04:02:16

Question


I'm trying to find the most frequent word in a list of words. Here is my code so far:

uniWords = unique(lower(words));
for i = 1:length(words)
    for j = 1:length(uniWords)
        if (uniWords(j) == lower(words(i)))
            freq(j) = freq(j) + 1;
        end
    end
end

When I try to run the script, I get the following error:

Undefined function 'eq' for input arguments of
type 'cell'.

Error in Biweekly3 (line 106)
    if (uniWords(j) == lower(words(i)))

Any help is appreciated!


Answer 1:


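Indexing a cell array with parentheses, as in uniWords(j), returns a 1-by-1 cell rather than the string inside it, and == is not defined for cells, which is what the error message is telling you. Extract the contents of the cell with {} instead: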

strcmpi(uniWords{j},words{i})

Also, I suggest comparing strings with strcmp or, in this case, strcmpi, which ignores case, so you do not need to call lower on the word being compared.
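Putting those two suggestions into the loop from the question gives something like the following. This is a minimal sketch, not the asker's full script: the sample word list is borrowed from another answer in this thread, the preallocation of freq with zeros is an assumption (the question does not show how freq is initialized), and the counter names wi and ui are illustrative.

```matlab
% Sample input (an assumption; the question does not show the contents of words).
words = {'The','hello','one','bye','the','one'};

uniWords = unique(lower(words));        % sorted, lower-cased unique words
freq = zeros(1, numel(uniWords));       % assumed initialization: one counter per unique word

for wi = 1:numel(words)
    for ui = 1:numel(uniWords)
        % Curly braces return the char array stored in the cell;
        % strcmpi compares whole strings and ignores case.
        if strcmpi(uniWords{ui}, words{wi})
            freq(ui) = freq(ui) + 1;
        end
    end
end
```

After the loop, freq(k) holds the number of times uniWords{k} occurs in words, regardless of case.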

Be careful when using == on strings: it compares character arrays element by element, so the two strings must be the same length or you will get an error:

>> s1='first';
>> s2='second';
>> s1==s2
Error using  == 
Matrix dimensions must agree. 
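For contrast, here is a small sketch of how strcmp and strcmpi behave on the same kind of input. The variable names are illustrative only. Both functions return a single logical value and accept strings of different lengths, whereas == on equal-length strings returns one logical per character.

```matlab
s1 = 'first';
s2 = 'second';
s3 = 'FIRST';

tfExact  = strcmp(s1, s2);     % false, and no error although the lengths differ
tfNoCase = strcmpi(s1, s3);    % true, because strcmpi ignores case
tfCase   = strcmp(s1, s3);     % false, because strcmp is case-sensitive

% With equal-length inputs, == does not error but yields a logical vector,
% one element per character position.
perChar = ('first' == 'fires');   % [1 1 1 0 0]
```

Because an if statement treats a logical vector as true only when every element is nonzero, the scalar result of strcmp or strcmpi is usually the safer thing to test.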



Answer 2:


No need for loops. unique gives you a unique identifier for each word, and you can then sum occurrences of each identifier with sparse. From that you easily find the maximum, and the maximizing word(s):

[uniWords, ~, jj] = unique(lower(words)); % jj(k) is the index into uniWords of the k-th word
freq = full(sparse(ones(1,length(jj)),jj,1)); % number of occurrences of each unique word
m = max(freq);
result = uniWords(freq==m); % return more than one word if there's a tie

For example, with

words = {'The','hello','one','bye','the','one'}

the result is

>> result

result = 

    'one'    'the'
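The same counting step can also be written with accumarray, which some readers find easier to follow than sparse. Note that this swaps in a different function from the one used in the answer; it is a sketch of an equivalent approach, with illustrative variable names, run on the example list above.

```matlab
words = {'The','hello','one','bye','the','one'};

% idx(k) is the position in uniWords of the k-th (lower-cased) word.
[uniWords, ~, idx] = unique(lower(words));

% Add 1 into the bin of each word's index; transpose to get a row of counts.
freq = accumarray(idx(:), 1).';

% Logical indexing keeps every word that reaches the maximum count (ties included).
mostFrequent = uniWords(freq == max(freq));
```

Here idx(:) forces a column vector, since accumarray expects its subscripts in that shape.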



Answer 3:


I guess you need to do:

if (uniWords{j} == lower(words{i}))

Also, I suggest not using i and j as variables in MATLAB.
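The usual reason for that advice is that i and j are built-in names for the imaginary unit, and assigning to them shadows that meaning for the rest of the script. A short illustration, with made-up values:

```matlab
z1 = 3 + 4*i;    % i is still the built-in imaginary unit here: 3 + 4i
i  = 2;          % a loop counter named i would shadow it like this
z2 = 3 + 4*i;    % now plain arithmetic: 11
z3 = 3 + 4i;     % the literal suffix form is unaffected by the shadowing
```

Using the suffix form (4i, 1i) for complex literals, or different names for loop counters, avoids the ambiguity.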

Update

As Chappjc points out, it is better to use strcmp (or, in your case, strcmpi, which lets you skip lower), since you want to ignore case.



Source: https://stackoverflow.com/questions/19870258/comparing-strings-in-cell-arrays
