Question
I'm working on an end-of-semester project for a programming languages course; the assignment is given below. I've nearly finished the Java version, but I'm having a lot of trouble writing it in Prolog, so this question is as much about understanding Prolog better as it is about the assignment itself. Any help would be greatly appreciated.
A sentence contains words, all occurring in a dictionary, that happen to be concatenated without white spaces as separators. Describe a solution that produces all possible answers, compatible with a given dictionary in 2 out of the following 3 languages: Java, Haskell, Prolog. The test data is provided as a UTF-8 text file containing one sentence per line, with all words occurring in the dictionary, provided as a UTF-8 text file with one word on each line. The output should be a UTF-8 text file containing the sentences with all words separated by white spaces.
Example of word file:
cat
dog
barks
runs
the
away
An example of a sentence file:
thedogbarks
thecatrunsaway
Answer 1:
The core of your program should be a predicate that tokenizes a list of character codes, i.e. builds a list of atoms (= words) out of the codes. Below is an outline:
%% tokenize(+Codes:list, -Atoms:list)
%
% Converts a list of character codes
% into a list of atoms. There can be several solutions.
tokenize([], []) :- !.
tokenize(Cs, [A | As]) :-
    % Use append/3 to extract the Prefix of the code list
    append(...),
    % Check if the prefix constitutes a word in the dictionary,
    % and convert it into an atom.
    is_word(Prefix, A),
    % Parse the remaining codes
    tokenize(...).
You can now define:
is_word(Codes, Atom) :-
    atom_codes(Atom, Codes),
    word(Atom).

word(the).
word(there).
word(review).
word(view).

split_words(Sentence, Words) :-
    atom_codes(Sentence, Codes),
    tokenize(Codes, Words).
and use it like this:
?- split_words('thereview', Ws).
Ws = [the, review] ;
Ws = [there, view] ;
false.
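The outline leaves the arguments of append/3 and of the recursive call elided. One possible way to fill them in is sketched below, assuming SWI-Prolog (where append/3 comes from library(lists)). The explicit non-empty-prefix check is an addition of this sketch and not part of the original outline; the dictionary is the toy one from the answer.

```prolog
:- use_module(library(lists)).   % append/3

% Toy dictionary from the answer.
word(the).
word(there).
word(review).
word(view).

% is_word(+Codes, -Atom): Codes spell an atom that is in the dictionary.
is_word(Codes, Atom) :-
    atom_codes(Atom, Codes),
    word(Atom).

% tokenize(+Codes, -Atoms): one possible completion of the outline.
tokenize([], []) :- !.
tokenize(Cs, [A | As]) :-
    Prefix = [_ | _],            % assumption: never try the empty prefix
    append(Prefix, Rest, Cs),    % on backtracking, tries every split of Cs
    is_word(Prefix, A),
    tokenize(Rest, As).

split_words(Sentence, Words) :-
    atom_codes(Sentence, Codes),
    tokenize(Codes, Words).
```

Because append/3 leaves a choice point for each way of splitting the code list, backtracking into tokenize/2 yields every segmentation, and findall/3 can collect them all at once, for example `findall(Ws, split_words(thereview, Ws), All)`.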
Alternatively, you can use it in a larger program that reads the input from a file and writes the results to a file.
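As a rough illustration of the file-based variant, here is a minimal sketch, again assuming SWI-Prolog. The predicate names load_dictionary/1, file_lines/2, read_lines/2 and split_file/3 are hypothetical helpers introduced only for this example, and tokenize/2 repeats one possible completion of the outline so that the block stands on its own. It loads the dictionary into dynamic word/1 facts and writes one output line per segmentation of each sentence.

```prolog
:- use_module(library(lists)).      % append/3, member/2
:- use_module(library(readutil)).   % read_line_to_codes/2

:- dynamic word/1.

% file_lines(+File, -Lines): non-empty lines of a UTF-8 file, as code lists.
file_lines(File, Lines) :-
    setup_call_cleanup(
        open(File, read, In, [encoding(utf8)]),
        read_lines(In, Lines),
        close(In)).

read_lines(In, Lines) :-
    read_line_to_codes(In, Line),
    (   Line == end_of_file
    ->  Lines = []
    ;   Line == []                  % skip blank lines
    ->  read_lines(In, Lines)
    ;   Lines = [Line | Rest],
        read_lines(In, Rest)
    ).

% load_dictionary(+File): replace the word/1 facts with one per line of File.
load_dictionary(File) :-
    retractall(word(_)),
    file_lines(File, Lines),
    forall(member(Codes, Lines),
           ( atom_codes(W, Codes), assertz(word(W)) )).

is_word(Codes, Atom) :-
    atom_codes(Atom, Codes),
    word(Atom).

% One possible completion of the outlined tokenize/2 (an assumption).
tokenize([], []) :- !.
tokenize(Cs, [A | As]) :-
    Prefix = [_ | _],
    append(Prefix, Rest, Cs),
    is_word(Prefix, A),
    tokenize(Rest, As).

% split_file(+DictFile, +InFile, +OutFile): write every segmentation of
% every sentence in InFile to OutFile, words separated by single spaces.
split_file(DictFile, InFile, OutFile) :-
    load_dictionary(DictFile),
    file_lines(InFile, Sentences),
    setup_call_cleanup(
        open(OutFile, write, Out, [encoding(utf8)]),
        forall(( member(Codes, Sentences),
                 tokenize(Codes, Words) ),
               ( atomic_list_concat(Words, ' ', Text),
                 format(Out, "~w~n", [Text]) )),
        close(Out)).
```

A call might look like `?- split_file('words.txt', 'sentences.txt', 'output.txt').`, where the three file names are placeholders. A sentence with several segmentations produces several output lines, and a sentence with none produces no line at all.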
Source: https://stackoverflow.com/questions/5852841/splitting-a-sentence-without-any-whitespace-seperators-into-a-sentence-with-whit