Can someone please tell me what a finite state transducer is?
I have read the Wikipedia article and don't understand a thing.
A finite state transducer is essentially a finite state automaton that works on two (or more) tapes. The most common way to think about transducers is as a kind of "translating machine": they read from one of the tapes and write onto the other. Take, for instance, a transducer that translates `a`s into `b`s. The label `a:b` on an arc means that in this transition the transducer reads `a` from the first tape and writes `b` onto the second.
Reference: Finite State Transducers
A finite state transducer (FST) is a finite state automaton (FSA, FA) which produces output as well as reading input, which means it is useful for parsing (while a "bare" FSA can only be used for recognizing, i.e. pattern matching).
An FST consists of a finite number of states which are linked by transitions labeled with an input/output pair. The FST starts out in a designated start state and jumps to different states depending on the input, while producing output according to its transition table.
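The description above (states, a start state, and a transition table of input/output pairs) could be sketched roughly like this. The class name `FST`, the set of accepting states `finals`, and the two-state `toggle` machine are all made up for illustration; the sketch also assumes a deterministic machine that writes exactly one symbol per symbol read.

```python
from dataclasses import dataclass


@dataclass
class FST:
    start: str
    finals: set   # accepting states (an assumption; not mentioned in the text above)
    table: dict   # (state, input symbol) -> (output symbol, next state)

    def run(self, tape):
        """Return the output string, or None if the input is rejected."""
        state, out = self.start, []
        for symbol in tape:
            arc = self.table.get((state, symbol))
            if arc is None:
                return None  # no transition for this input in this state
            written, state = arc
            out.append(written)
        return "".join(out) if state in self.finals else None


# Hypothetical two-state machine: rewrites "a" alternately as "x" and "y".
toggle = FST(
    start="even",
    finals={"even", "odd"},
    table={("even", "a"): ("x", "odd"), ("odd", "a"): ("y", "even")},
)

print(toggle.run("aaa"))  # xyx
print(toggle.run("ab"))   # None
```

The same input symbol produces different output depending on the current state, which is the part a plain lookup table cannot do.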
FSTs are useful in NLP and speech recognition because they have nice algebraic properties, most notably that they can be freely combined (form an algebra) under composition, which implements relational composition on regular relations (think of this as non-deterministic function composition) while staying very compact. FSTs can do parsing of regular languages into strings in linear time.
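As a rough illustration of composition, here is a sketch of the product construction for a very restricted case: deterministic transducers that write exactly one symbol per symbol read (no epsilon arcs, no weights). Real toolkits handle the general case; the dict encoding and the names `compose`, `run`, `a_to_b` and `b_to_c` are assumptions made for this example.

```python
def compose(t1, t2):
    """Compose two transducers given as {(state, input): (output, next_state)}.

    An arc a:c exists from (p, q) when t1 maps a to b in state p
    and t2 maps that same b to c in state q.
    """
    composed = {}
    for (p, a), (b, p_next) in t1.items():
        for (q, b_in), (c, q_next) in t2.items():
            if b == b_in:
                composed[((p, q), a)] = (c, (p_next, q_next))
    return composed


def run(table, start, tape):
    """Feed tape through a transducer table and return the output string."""
    state, out = start, []
    for symbol in tape:
        written, state = table[(state, symbol)]
        out.append(written)
    return "".join(out)


a_to_b = {(0, "a"): ("b", 0)}  # rewrites a as b
b_to_c = {(0, "b"): ("c", 0)}  # rewrites b as c

a_to_c = compose(a_to_b, b_to_c)
print(run(a_to_c, (0, 0), "aaa"))  # ccc
```

The composed machine goes from `a` to `c` in one pass and never materializes the intermediate `b` tape.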
As an example, I once implemented morphological parsing as a bunch of FSTs. My main FST for verbs would turn a regular verb, say "walked", into "walk+PAST". I also had an FST for the verb "to be", which would turn "is" into "be+PRESENT+3rd" (3rd person), and similarly for other irregular verbs. All the FSTs were combined into a single one using an FST compiler, which produced a single FST that was much smaller than the sum of its parts and ran very fast. FSTs can be built by a variety of tools that accept an extended regular expression syntax.
In the simplest terms, an FST is a "thing" that moves from one state to the next based on an input tape and writes to a separate output tape. A tape is essentially a sequence of symbols, like the characters in a string.
The entire FST is represented by a set of states and links between them. A link is "activated" when its input condition is met, and it then passes the remaining tape to the next state.
For example, let's say an FST starts with the tape `abc` at state 1. A link to state 2 matches `a` and changes it to `b`. This link would get activated, set the output tape to just `b`, and pass the remaining `bc` to state 2. As you can see, a state is only reached if there is a link to it whose input condition was met; that link passes the remaining input to the next state and writes to a separate output tape. Each FST runs across the input tape once and writes to the output tape once.
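The single step from the `abc` example can be written out as a small Python sketch. The dict `LINKS` and the function `step` are names I made up; only the one link from state 1 to state 2 is defined, since the example does not say what happens after that.

```python
# Hypothetical encoding of the one link in the example:
# in state 1, read "a", write "b", move to state 2.
LINKS = {(1, "a"): ("b", 2)}


def step(state, remaining, output):
    """Fire the link matching the next input symbol and return the new configuration."""
    if not remaining:
        raise ValueError("input tape is exhausted")
    key = (state, remaining[0])
    if key not in LINKS:
        raise ValueError(f"no link for {remaining[0]!r} from state {state}")
    written, next_state = LINKS[key]
    # Consume one input symbol and append one symbol to the output tape.
    return next_state, remaining[1:], output + written


print(step(1, "abc", ""))  # (2, 'bc', 'b')
```

Running a full FST is just repeating this step until the input tape is empty.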
For a clearer understanding, read this article and take a look at its diagrams (original broken link).