21-06-2012, 04:10 PM
SFST Manual
Introduction
The Stuttgart Finite State Transducer (SFST) tools are a collection of software tools for
the generation, manipulation and processing of finite-state automata and transducers. A
finite-state transducer (FST) maps strings from one regular language (the surface language) onto
strings from another regular language (the analysis language). One important application of
FSTs is morphological analysis, where a word form such as translations might be mapped
to the analysis string translate<V>ion<N><pl>. The mapping between surface strings
and analysis strings is reversible: the same transducer can be used (i) to generate analyses
for a surface form (in analysis mode) and (ii) to generate surface forms for an analysis (in
generation mode). The number of generated output strings is not necessarily 1, but can be
anywhere between 0 and infinity.
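To make the two modes concrete, here is a minimal Python sketch that models a transducer as a finite table of (surface, analysis) pairs. It is only an illustration and not part of SFST: the names PAIRS, analyze and generate are made up for this example, and the "walks" entries are hypothetical. A finite table also cannot show the infinite case, which needs a real transducer.

```python
from collections import defaultdict

# Hypothetical toy relation: (surface form, analysis string) pairs.
# Only the first pair comes from the text; the others are invented.
PAIRS = [
    ("translations", "translate<V>ion<N><pl>"),
    ("walks", "walk<V><3sg>"),
    ("walks", "walk<N><pl>"),
]


def build_lookup(pairs):
    """Index the same relation in both directions."""
    analyses = defaultdict(list)
    surfaces = defaultdict(list)
    for surface, analysis in pairs:
        analyses[surface].append(analysis)
        surfaces[analysis].append(surface)
    return analyses, surfaces


ANALYSES, SURFACES = build_lookup(PAIRS)


def analyze(surface):
    """Analysis mode: surface form -> list of analysis strings."""
    return list(ANALYSES.get(surface, []))


def generate(analysis):
    """Generation mode: analysis string -> list of surface forms."""
    return list(SURFACES.get(analysis, []))
```

The same table answers both directions, and a lookup can return no result, one result, or several, which mirrors the remark about the number of output strings.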
The SFST Programming Language
The expression Hello\ world\! defines a transducer which maps the string Hello world! onto itself and rejects any other
input. The blank and the exclamation mark have to be quoted with a backslash because unquoted
blank and tab characters are ignored and unquoted exclamation marks are interpreted
as negation operators.
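The quoting rules above can be sketched in Python. This is a rough model of only the two rules stated in the paragraph (unquoted blanks and tabs are dropped, a backslash makes the next character literal). It is not the real SFST lexer, which treats more characters specially. The function names literal_symbols and accepts are invented for the illustration.

```python
def literal_symbols(expr):
    """Return the literal characters denoted by a simple string expression.

    Models only the rules from the text: backslash quotes the next
    character, unquoted blanks/tabs are ignored, and an unquoted '!'
    is an operator rather than a literal.
    """
    symbols = []
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == "\\":
            if i + 1 >= len(expr):
                raise ValueError("dangling backslash at end of expression")
            symbols.append(expr[i + 1])  # quoted character is kept literally
            i += 2
        elif ch in " \t":
            i += 1  # unquoted whitespace is skipped
        elif ch == "!":
            raise ValueError("unquoted '!' is an operator, not a literal")
        else:
            symbols.append(ch)
            i += 1
    return symbols


def accepts(expr, text):
    """Identity mapping: accept exactly the string the expression spells out."""
    return "".join(literal_symbols(expr)) == text
```

With this model, the expression Hello\ world\! (written "Hello\\ world\\!" as a Python string) spells out Hello world!, whereas leaving the blank unquoted would silently yield Helloworld.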
Because the output of the previous transducer is always identical to its input, it is equivalent
to a finite-state automaton. The expression (a:b | b:a | c)* specifies a real transducer which (in
generation mode) maps a string of a's, b's, and c's to a string where the c's are unchanged
and the a's have been replaced with b's and vice versa.
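The behaviour of this a/b-swapping transducer can be imitated in a few lines of Python. This is a sketch of the mapping only, assuming symbol-by-symbol translation. The names SWAP, generate and analyze are hypothetical, and rejection is modelled by returning None.

```python
# Symbol pairs of the relation: a <-> b swapped, c unchanged.
SWAP = {"a": "b", "b": "a", "c": "c"}
# The inverse relation, used for the opposite direction.
INVERSE = {out: inp for inp, out in SWAP.items()}


def _translate(table, text):
    """Map each symbol through the table; reject (None) on unknown symbols."""
    result = []
    for ch in text:
        if ch not in table:
            return None  # the transducer has no path for this input
        result.append(table[ch])
    return "".join(result)


def generate(text):
    """Generation mode: apply the mapping left to right."""
    return _translate(SWAP, text)


def analyze(text):
    """Analysis mode: apply the inverted mapping."""
    return _translate(INVERSE, text)
```

Because swapping a and b is its own inverse, both modes give the same result here. In general the two directions differ, which is why the sketch keeps a separate inverse table.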
Include Command
Complex computer programs are usually stored in a set of files rather than a single file, and
the compiler combines these files into a single program. The same can be done with SFST
programs. The command #include "file.fst" instructs the compiler to insert the contents
of the file file.fst at the current position.
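Textual inclusion of this kind can be sketched as follows. The Python code is an illustration of the idea, not the SFST compiler's implementation: it works on an in-memory dictionary of file contents, and the regular expression, the function name expand_includes and the circular-include check are all assumptions made for the example.

```python
import re

# Matches a line consisting only of: #include "name"
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(name, files, _stack=()):
    """Return the lines of files[name] with #include lines replaced
    by the (recursively expanded) contents of the named file.

    files is a dict mapping file names to their text.
    """
    if name in _stack:
        raise ValueError("circular include: " + name)
    expanded = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Insert the included file's lines at the current position.
            expanded.extend(
                expand_includes(match.group(1), files, _stack + (name,))
            )
        else:
            expanded.append(line)
    return expanded
```

For example, if main.fst contains a line #include "file.fst" between two other lines, the expanded result has the lines of file.fst in exactly that place.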
SFST programs create complex transducers by combining simpler transducers. If the compilation
of some component transducer is expensive and the respective source code is seldom
modified, it is useful to pre-compile this transducer. To this end, a separate SFST program
has to be written which implements the component transducer. This program is compiled
and the resulting transducer is stored e.g. in a file named inc.a. The main program reads
the precompiled transducer with the command "<inc.a>".
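A build script could decide when the component needs recompiling with a simple timestamp check, roughly as sketched below in Python. Everything here is an assumption for illustration: the source file name inc.fst is hypothetical, the helper names are invented, and the command line assumes the compiler is invoked as fst-compiler with the source file followed by the output file.

```python
import os


def mtime_or_none(path):
    """Modification time of path, or None if the file does not exist."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def is_stale(source_mtime, compiled_mtime):
    """True if the compiled transducer is missing or older than its source."""
    return compiled_mtime is None or source_mtime > compiled_mtime


def compile_command(source_path, compiled_path):
    """Command line for recompiling the component.

    Assumption: the compiler is called as `fst-compiler <source> <output>`.
    """
    return ["fst-compiler", source_path, compiled_path]
```

A script would run compile_command("inc.fst", "inc.a") (for example via subprocess.run) only when is_stale reports True, so the expensive compilation is skipped as long as the component's source is unchanged.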