How to calculate possible word subsequences matching a pattern?

南楼画角 submitted on 2019-12-10 21:15:58

Question


Suppose I have a sequence:

    Seq = 'hello my name'

and a string:

    Str = 'hello hello my friend, my awesome name is John, oh my god!'

I then look for matches of each word of the sequence within the string and store the "word" index of every match in a cell array. The first element is a cell containing the matches for 'hello', the second contains the matches for 'my', and the third the matches for 'name'.

    Match = {[1 2];      %'hello' matches
             [3 5 11];   %'my' matches
             [7]}        %'name' matches
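For reference, one way such a `Match` cell array might be built is sketched below. This is only an illustration, not code from the question: the variable names `seqWords` and `strWords` are made up here, and the sketch assumes that a "word" is a run of `\w` characters (so punctuation is dropped) and that matching is exact and case-sensitive.

```matlab
Seq = 'hello my name';
Str = 'hello hello my friend, my awesome name is John, oh my god!';

% Split both inputs into words (assumption: a word is a run of \w characters)
seqWords = regexp(Seq, '\w+', 'match');
strWords = regexp(Str, '\w+', 'match');

% For each word of the sequence, collect the word positions where it occurs in Str
Match = cellfun(@(w) find(strcmp(strWords, w)), seqWords, 'UniformOutput', false).';
```

With the inputs above this should reproduce the cell array shown in the question, one cell per word of `Seq`.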

I need code to somehow get an answer saying that possible sub-sequence matches are:

    Answer = [1 3 7;     %[hello my name]
              1 5 7;     %[hello my name]
              2 3 7;     %[hello my name]
              2 5 7;]    %[hello my name]

"Answer" should contain all possible ordered sequences, i.e. rows whose word indices are strictly increasing. That is why 'my' at word 11 never appears in "Answer": there would have to be a 'name' match after position 11.

NOTE: The number of words in "Seq" and the number of matches for each word may vary.


Answer 1:


Since the number of cells in Match may vary, you need to use comma-separated lists, together with ndgrid, to generate all combinations (the approach is similar to that used in this other answer). Then filter out the combinations whose indices are not strictly increasing, using diff and logical indexing:

    cc = cell(1, numel(Match));                 % pre-shape to receive the ndgrid outputs
    [cc{end:-1:1}] = ndgrid(Match{end:-1:1});   % comma-separated lists in and out; reversed so the first column varies slowest
    cc = cellfun(@(v) v(:), cc, 'UniformOutput', false);   % linearize each cell
    combs = [cc{:}];                            % concatenate into a matrix, one combination per row
    ind = all(diff(combs, [], 2) > 0, 2);       % row-wise test for increasing indices (also correct with only two words)
    combs = combs(ind, :);                      % remove unwanted combinations

The desired result is in the variable combs. In your example,

    combs =
         1     3     7
         1     5     7
         2     3     7
         2     5     7
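When the words have many matches, the full grid of combinations can become large before it is filtered. As an alternative, and explicitly not the method of the answer above, the rows could be built one word at a time, keeping only extensions whose index is larger than the last one already chosen. The following is a minimal sketch under that assumption; `next`, `r`, `c` and `picked` are illustrative names, and `bsxfun` is used so that it does not rely on implicit expansion.

```matlab
Match = {[1 2]; [3 5 11]; [7]};

combs = Match{1}(:);                      % one row per candidate for the first word
for k = 2:numel(Match)
    next = Match{k}(:).';                 % candidates for word k, as a row vector
    % Pair every partial row with every later candidate of word k
    [r, c] = find(bsxfun(@lt, combs(:, end), next));
    picked = next(c);
    combs = [combs(r, :), picked(:)];     % extend the surviving partial rows
end
combs = sortrows(combs);                  % lexicographic order, as in the answer
```

For the example data this should give the same four rows as above. If some word has no valid continuation, `combs` ends up with zero rows.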


Source: https://stackoverflow.com/questions/21867681/how-to-calculate-possible-word-subsequences-matching-a-pattern
