Getting the longest common subsequence in ERLANG

百般思念 提交于 2019-12-10 19:13:13

问题


I'm new to this ERLANG, I know the basics. It's like scheme but broader. I know how to create a function but I'm having problems with creating a function that gets the Longest Common Subsequence.

lcs(str1,str2) is a function that will accept two string and output an integer:

lcs(algorithm,logarithm) will output 7 because the longest common subsequence that can be made is lgrithm which is 7 in size.

Any answer is greatly appreciated.


回答1:


You have a pretty good implementation of an LCS algorithm available on Rosettacode, which is:

-module(lcs).
-compile(export_all).

lcs_length(S,T) ->
    {L,_C} = lcs_length(S,T,dict:new()),
    L.

lcs_length([]=S,T,Cache) ->
    {0,dict:store({S,T},0,Cache)};
lcs_length(S,[]=T,Cache) ->
    {0,dict:store({S,T},0,Cache)};
lcs_length([H|ST]=S,[H|TT]=T,Cache) ->
    {L,C} = lcs_length(ST,TT,Cache),
    {L+1,dict:store({S,T},L+1,C)};
lcs_length([_SH|ST]=S,[_TH|TT]=T,Cache) ->
    case dict:is_key({S,T},Cache) of
        true -> {dict:fetch({S,T},Cache),Cache};
        false ->
            {L1,C1} = lcs_length(S,TT,Cache),
            {L2,C2} = lcs_length(ST,T,C1),
            L = lists:max([L1,L2]),
            {L,dict:store({S,T},L,C2)}
    end.

lcs(S,T) ->
    {_,C} = lcs_length(S,T,dict:new()),
    lcs(S,T,C,[]).

lcs([],_,_,Acc) ->
    lists:reverse(Acc);
lcs(_,[],_,Acc) ->
    lists:reverse(Acc);
lcs([H|ST],[H|TT],Cache,Acc) ->
    lcs(ST,TT,Cache,[H|Acc]);
lcs([_SH|ST]=S,[_TH|TT]=T,Cache,Acc) ->
    case dict:fetch({S,TT},Cache) > dict:fetch({ST,T},Cache) of
        true ->
            lcs(S,TT,Cache, Acc);
        false ->
            lcs(ST,T,Cache,Acc)
    end.

Used as:

raringbeast:Code pierre $ erl
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe]
[kernel-poll:false]

Eshell V5.9.1  (abort with ^G)
1> c(lcs).
{ok,lcs}
2> lcs:lcs("logarithm", "algorithm").
"lgrithm"
3> lcs:lcs_length("logarithm", "algorithm").
7

--

Edit: Made the algorithm a little easier to understand

The cache here is just an interesting way to improve algorithm performance in some cases, but this isn't required here. We can just write, removing the cache:

lcs_length([],_T) ->
    0;
lcs_length(_S,[]) ->
    0;
lcs_length([_H|ST],[_H|TT]) ->
    1 + lcs_length(ST,TT);
lcs_length([_SH|ST]=S,[_TH|TT]=T) ->
    L1 = lcs_length(S,TT),
    L2 = lcs_length(ST,T),
    lists:max([L1,L2]). 

To sum up:

  1. The LCS of "" and anything is 0.
  2. Same for the LCS of anything and "".
  3. The LCS of two words starting by the same letter is the LCS of the two words minus the first letter plus 1.
  4. The LCS of two words starting by different letters is the max of (a) the LCS of the first word minus the first letter and the second, and (b) the LCS of the first word and the second minus the first letter.


来源:https://stackoverflow.com/questions/18920564/getting-the-longest-common-subsequence-in-erlang

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!