Splitting a sentence without any whitespace/separators into a sentence with whitespace

我们两清 posted on 2019-12-06 15:17:38

The core of your program should be a predicate that tokenizes a list of character codes, i.e. builds a list of atoms (= words) out of the codes. Below is an outline:

%% tokenize(+Codes:list, -Atoms:list)
%
% Converts a list of character codes
% into a list of atoms. There can be several solutions.
tokenize([], []) :- !.

tokenize(Cs, [A | As]) :-
    % Use append/3 to extract the Prefix of the code list
    append(...),
    % Check if the prefix constitutes a word in the dictionary,
    % and convert it into an atom.
    is_word(Prefix, A),
    % Parse the remaining codes
    tokenize(...).
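
To see why append/3 suits the prefix-extraction step, it helps to look at how it behaves when only its third argument is bound: on backtracking it enumerates every way of splitting the list in two. The sketch below assumes SWI-Prolog; prefix_of/3 is a hypothetical helper introduced only for illustration and is not part of the outline above.

```prolog
:- use_module(library(lists)).   % append/3 (autoloaded in SWI-Prolog, stated for clarity)

% prefix_of(+Atom, -Prefix, -Rest)
% Hypothetical helper: on backtracking, enumerates every split of
% Atom into a Prefix atom and a Rest atom, shortest prefix first.
prefix_of(Atom, Prefix, Rest) :-
    atom_codes(Atom, Codes),
    % With only Codes bound, append/3 yields each prefix/suffix pair in turn.
    append(PrefixCodes, RestCodes, Codes),
    atom_codes(Prefix, PrefixCodes),
    atom_codes(Rest, RestCodes).
```

Querying prefix_of(the, P, R) gives four solutions in this order: '' and the, t and he, th and e, the and ''. In tokenize/2 the dictionary check then discards every prefix that is not a known word, including the empty one.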

You can now define:

is_word(Codes, Atom) :-
    atom_codes(Atom, Codes),
    word(Atom).

word(the).
word(there).
word(review).
word(view).

split_words(Sentence, Words) :-
    atom_codes(Sentence, Codes),
    tokenize(Codes, Words).

and use it like this:

?- split_words('thereview', Ws).
Ws = [the, review] ;
Ws = [there, view] ;
false.

You can also call it from a larger program that reads the input from a file and writes the results to a file.
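
As a rough illustration of the file-based variant, here is one possible stand-alone program. It is a sketch under several assumptions: it targets SWI-Prolog, it fills in the two elided goals of the outline in one plausible way (not necessarily the way the answer intended), it treats each input line as one unsegmented sentence, and it keeps only the first segmentation found. split_file/2 and copy_lines/2 are hypothetical names.

```prolog
:- use_module(library(readutil)).   % read_line_to_codes/2 (SWI-Prolog)
:- use_module(library(lists)).      % append/3

% Toy dictionary taken from the answer.
word(the).
word(there).
word(review).
word(view).

is_word(Codes, Atom) :-
    atom_codes(Atom, Codes),
    word(Atom).

% One possible completion of the outline (an assumption).
tokenize([], []) :- !.
tokenize(Codes, [Atom | Atoms]) :-
    append(Prefix, Rest, Codes),   % try every prefix/suffix split
    is_word(Prefix, Atom),         % keep only prefixes found in the dictionary
    tokenize(Rest, Atoms).         % segment the remaining codes the same way

split_words(Sentence, Words) :-
    atom_codes(Sentence, Codes),
    tokenize(Codes, Words).

% split_file(+InFile, +OutFile)
% Hypothetical helper: segments InFile line by line into OutFile.
% Both streams are closed even if an error occurs.
split_file(InFile, OutFile) :-
    setup_call_cleanup(
        open(InFile, read, In),
        setup_call_cleanup(
            open(OutFile, write, Out),
            copy_lines(In, Out),
            close(Out)),
        close(In)).

copy_lines(In, Out) :-
    read_line_to_codes(In, Codes),
    (   Codes == end_of_file
    ->  true
    ;   (   tokenize(Codes, Words)          % commit to the first segmentation
        ->  atomic_list_concat(Words, ' ', Line)
        ;   atom_codes(Line, Codes)         % no segmentation: copy line unchanged
        ),
        format(Out, "~w~n", [Line]),
        copy_lines(In, Out)
    ).
```

With an input file containing the single line thereview, calling split_file('in.txt', 'out.txt') (both file names are placeholders) would write the line "the review" to the output file. To record every segmentation rather than the first, the if-then-else around tokenize/2 could be replaced with a findall/3 or forall/2 over its solutions.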
