Linguistics 484 - Grammars Ling484 Notes A. C. Brett
Department of Linguistics
University of Victoria
Clearihue C139
Last updated: 3 March 2001

Finite-State Automata

Automata are abstract machines that are normally conceived of as idealised, but much simplified computers. McCullough and Pitts (1943) are often identified as the inventors of the simplest of these machines, the Finite-State Automaton. The concepts introduced with the Finite-State Automaton played a significant role in the subsequent development of general-purpose computers. As the title of the McCullough and Pitts article suggests, however, as does also the title of an article by Kleene (1956), another significant early contributor to the theory of finite automata, these machines were initially thought to emulate the processing performed by networks of neurons. Finite-State Automata continue to be employed as models and in practical natural language processing applications. See, for example, Eisner et al. (2000), Karttunen and Oflazer (1998), Roche and Schabes (1997).

Finite-State Automata are important because of their equivalence to the Type 3 grammars of the Chomsky Hierarchy, a fact established by Chomsky and Miller (1958). The equivalence of a Finite-State Automaton to a Type 3 grammar means that, given a Type 3 grammar G, a Finite-State Automaton A can be devised that recognises just the sentences generated by G. In fact, the Type 3 grammars, which are also known as the regular or right-linear grammars, are often identified as the "finite grammars" because of their equivalence to finite automata.

Another variety of automaton, namely the Push-Down Automaton, is the subject of another note. The Push-Down Automaton differs from the Finite-State machine only in that it is equipped with a memory device that functions as a push-down store, which is often called a stack. The Finite-State Automaton itself does not include a memory device. Push-Down Automata are equivalent to context-free grammars, also known as the Chomsky Type 2 grammars, which means that, given a context-free grammar G, a Push-Down Automaton A can be devised that recognises just the sentences generated by G.

The Finite-State Automata and their equivalence to finite grammars are the subject of this note. The following topics are discussed:

A Finite-State Automaton is usually described as consisting of three components: with the nature and operations of these components being as follows:

Ordered Quintuple Specification. A Finite-State Automaton A can be specified by the entries in the following ordered quintuple:

A = Q, T, M, q0, H

Transition Function. The transition function of the automaton, which is also called its transition matrix or transition table, specifies the state into which the machine will move on the basis of its current state and the word that is read. Thus, the transition function M can be represented as a set of ordered triples with the form

qi, w, qj
where Elements of the transition function can also be represented in a rewriting rule format such as
qi w qj

Language of the Automaton. A string consisting of the n input symbols w1w2 ... wn is accepted by the automaton A if, for each symbol wi T, where i = 1 to n, there is a transition

qi-1, wi, qi M
with q0 Q being the starting state of A and qn H being an accepting state. Thus, there is a sequence of transitions such that, as the first symbol w1 of the string is read, the transition q0, w1, q1 shifts the machine out of its starting state, and as the last symbol wn is read, the last transition qn-1, wn, qn in the sequence shifts the machine into an accepting state. The set of all strings of input symbols accepted by an automaton A is called the language of the automaton, and is denoted L(A).

Reductions. The processing of a string of symbols by a finite-state automaton can be treated as something like the derivation of a sentence by a phrase-structure grammar. Rather than producing a sentence, however, the derivation process performed by a finite-state automaton actually consumes or reduces an input string. This process, called a reduction, is begun by concatenating an input string w1w2 ... wn T* to the starting state label q0 to obtain the string q0w1w2 ... wn.

A reduction of this string is then performed by applying transitions as if they were rewriting rules. Thus, the first transition, that shifts the automaton out of its starting state as it reads the first input symbol, has the form q0w1 q1. Application of this rule to the initial string replaces q0w1 with q1 to yield the string q1w2 ... wn. This process can be represented by writing

q0w1w2 ... wn q1w2 ... wn
where can be read as "reduces in one transition to." The reduction corresponding to the reading of the ith word wi of the input string during the transition qi-1wi qi can be represented by writing
qi-1wiwi+1 ... wn qiwi+1 ... wn
and the reduction during the transition qn-1wn qn as the last word wn is read can be represented as
qn-1wn qn
where qn H is an accepting state of the automaton. Consequently, given a string s T*, it is the case that s L(A) if and only if
q0s * qn
where * may be read as "reduces in finitely many transitions to." The language L(A) accepted by the finite-state automaton A may therefore be defined as follows:
L(A) = {s | s T*, q0s * qn, qn H}

Equivalence to Finite Grammars. Finite-state automata are said to be equivalent to finite (Chomsky Type 3, regular) grammars, which means that, given an arbitrary finite grammar G', a finite-state automaton A can be divised that accepts just the sentences generated by G'. In other words,

L(A) = L(G')
where L(A) is the language of the automaton A, and L(G') is the language generated by G'.

The construction of an equivalent finite-state automaton proceeds as follows. In the first place, recall that the rewriting rules of a regular or finite grammar have the general form

A aB
B b
where A and B denote nonterminals, while a and b represent strings of terminals symbols. Recall further that, given an arbitrary finite grammar G', an equivalent finite grammer G can be devised such that all its rules have one or the other of the following forms
A aB
B l
that is, the right-hands sides of the rules of G consist of a single terminal symbol followed by a nonterminal, or the right-hand side is the null string. If
G = VN, VT, S, P
then a finite-state automaton A where
A = Q, T, M, q0, H
that is equivalent to G can be specified according to the following equalities: Thus, the nonterminal symbols of the grammar G become the labels of the states of the automaton A, with the starting symbol S of G becoming the label of the starting state of A, while the terminal symbols of G become the input symbols of A. The accepting states of A correspond to, and are labelled by the left-hand side symbols of those rules of G with the null string on the right-hand side. Finally, for every other rule of G with the form A aB in its production set P, there is an element of the form Aa B in the transition function M of A.

Consequently, if s is a sentence generated by G, that is s L(G), then for every rule in P applied to generate s, there is a transition in M that can be applied to accept s. Hence, s L(A) so that L(G) L(A). Conversely, if s L(A), a sequence of rules can be found in P that matches the sequence of transitions in M that are executed to accept s. Hence, s L(G) so that L(A) L(G). Consequently, we have that

L(A) = L(G)

Example. Consider the following normal form finite grammar G with nonterminal and terminal vocabularies

VN = {S, VPs, VPp, NP, H}
VT = {she, he, it, they, them, dogs, run, like, play, plays, likes, runs}
and with a production set consisting of the following rules:
P = { S she VPs, S he VPs, S it VPs, S dogs VPp, S they VPp,
VPs likes NP, VPs plays NP, VPs plays H, VPs runs H,
VPp like NP, VPp play NP, VPp play H, VPp run H,
NP it H, NP them H,
H l }
The state label set of the equivalent automaton is then just the nonterminal vocabulary VN of the grammar, and the vocabulary of input symbols is just the vocabulary VT of terminal symbols, that is
Q = {S, VPs, VPp, NP, H}
T = {she, he, it, they, them, dogs, run, like, play, plays, likes, runs}
The set of accepting states contains the single label H from the nonterminal vocabulary of the grammar, and the starting state of the equivalent automaton is labelled with the starting symbol of the grammar so that
H = {H}
q0 = S
The transition function of the automaton then consists of the the following elements:
M = { S she VPs, S he VPs, S it VPs, S dogs VPp, S they VPp,
VPs likes NP, VPs plays NP, VPs plays H, VPs runs H,
VPp like NP, VPp play NP, VPp play H, VPp run H,
NP it H, NP them H }

Transition Network. The transition funtion of a finite-state automaton specifies a transition network. Such networks can be represented graphically as is illustrated by the diagram included here which corresponds to the transition function shown in the foregoing example.

Finite-State Transition Network

In transition network diagrams, the states of the machine are represented with circles that are labelled with symbols from the set Q of states. The accepting states are usually distinguished from the others by being depicted with double concentric circles. The state circles in the diagram are called the nodes of the network. In the case of an automaton that is equivalent to a finite grammar, as is the case for the transition network illustrated here, the nodes are labelled with the nonterminal symbols of the grammar.

Nodes are joined by arrows called arcs that correspond to the transitions between the states represented by the nodes. The arrowhead on the arc identifies the direction of the transition to which the arc corresponds. Arcs are labelled by the input symbols that are read during the corresponding transition. In a transition network diagram for an automaton that is equivalent to a finite grammar, the arcs are labelled with the words from the terminal vocabulary of the grammar. As the accompanying diagram shows, arcs may have multiple labels when two or more different words can be read during a transition between the same pair of states. For example, the arc between the starting state S and the VPs node is labelled with the three words 'she,' 'he,' and 'it.' Also, the same word can label more than one arc, as is illustrated by the fact the both the transition between the S and VPs nodes and between the NP and H nodes are labelled with the word 'it.'

Accepting Paths. As an automaton processes a string, it can be described as traversing a path in its transition network. Such paths can consist of a single transition between a pair of states along an arc labelled by a word that is read. Paths can also consist of two or more transitions. A path that begins in the starting state and ends in an accepting state, with all of the words of the input string having been read, is called an accepting path. Note, however, that the machine is actually not moving along the arcs of the transition network diagram; it is simply shifting between its internal states that are represented by the nodes in the diagram.

Nondeterministic Automata. A finite-state automaton is said to be deterministic if its transition function is single-valued, that is, for each current state and input symbol pair, there is a unique new state. An automaton is said to be nondeterministic if its transition function is multi-valued, that is, there are at least two transitions with the same current state and input symbol pair, but with different new states. The finite-state machine for which the transition network is illustrated here is a nondeterministic automaton because, with the machine in the VPs state and with the input word 'plays,' the following two transitions to the different states NP and H are available:

VPs plays NP
VPs plays H
For any nondeterministic automaton A, an equivalent deterministic automaton A' can in principle be constructed. Thus, if A is equivalent to some finite grammar G, then the new automaton A' is also equivalent to G and will accept just the sentences generated by G. Construction of A', however, entails the introduction of new states not included in the state label vocabulary Q of the automaton A. Hence, the correspondence between the nonterminal vocabulary VN of G and the state label vocabulary Q of the nondeterministic automaton A is lost.

In practice, where preservation of the corresondence between VN and Q is important, it is customary to continue to work with the nondeterministic automaton, rather than with an equivalent deterministic machine. The operation of the automaton is then modified to permit it to emulate nondeterministic processing by incorporating the capability to backtrack. This capability prevents the machine from becoming blocked, and failing to accept a legitimate string, because it has executed one of the alternative transitions that happens to result in its rejecting a string which would have been accepted if one of the other alternatives had been executed.

Linguistics 484 Notes Index Top of Page