Linguistics 484 - Grammars Ling484 Notes A. C. Brett acbrett@uvic.ca
Department of Linguistics
University of Victoria
Clearihue C139
Last updated: 18 March 2001

Push-Down Automata

A Push-Down Automaton is finite-state machine that it is equipped with a memory device that functions as a push-down store. Push-down automata are equivalent to context-free grammars, also known as the Chomsky Type 2 grammars, which means that, given a context-free grammar G, a push-down automaton A can be devised that recognises just the sentences generated by G. This relationship between context-free grammars and push-down automata was first described by Chomsky (1962), although a machine closely related to a push-down automaton was employed earlier by Yngve (1960) to model the transient memory required by the human processor to analyse sentences with a variety of different structures generated by context-free grammars. Yngve's work continues to be cited and the essentials of his method are still applied in studies of short-term memory and sentence processing (see Murata et al. (2001), for example) with the store of a push-down automaton serving as a model of transient memory employed to analyse sentences with different structures.

This note describes the architecture and operation of a push-down automaton, and demonstrates the equivalence of this machine to context-free grammar. The following topics are discussed:

A Push-Down Automaton is usually described as consisting of four components: The nature and operations of the control, read, and tape units are the same as those of a finite-state automaton, except that the transitions executed by the control unit of a push-down automaton involve operations to store symbols in, and retrieve symbols from its memory unit. The memory unit of a push-down automaton operates as a stack or push-down store, which leads to the name for this variety of abstract machine.

Push-Down Store. The stack or push-down store of a push-down automaton can be treated as a list. Symbols can be added to or removed from this list, however, only from one end. This end of the list is called the top of the store. When a symbol is added to the store, it is said to be pushed onto the top of the store. When a symbol is removed, it is said to be popped from the store. A symbol within the store can popped only after all the symbols above it have been popped. Thus, the last symbol pushed onto the store is always the first symbol to be popped from it. Hence, a push-down store is sometimes called a last in/first out memory.

The symbols stored in the push-down store of a push-down automaton are the labels of its states. Thus, the machine, in effect, uses its store to record the labels of certain of the states it has experienced in the course of its processing an input string. Because the store is a last in/first out memory, the labels of the states experienced earlier in the processing of the string are beneath those of the more recently experienced states which are closer to the top of the store.

Although the store of a push-down automaton has a top, it is not considered to have a bottom. A push-down store can, in principle, record infinitely many items. The end of the list of stored items, however, is usually marked with a special symbol called the empty store symbol because when the store is empty, this special symbol is at the top. Thus, the empty store symbol is added to the state label vocabulary of the automaton to make up its store vocabulary or push-down vocabulary.

Ordered Sextuple Specification. A Push-Down Automaton A can be specified by the entries in the following ordered sextuple:

A = Z, Q, T, M, q0, H
wherein

Transition Function. The transition function, M, of the automaton, which is also called the transition matrix or transition table of the machine, can be represented as a set of ordered quintuples with the following form:

zi, qi, w, zj, qj
where Elements of the transition function can also be represented in a rewriting rule format such as
zi qi w zj qj
Unlike a finite-state automaton, most implementations of which execute only one variety of transition, and during its transitions, the machine always reads an input symbol, a push-down automaton can perform several different types of transition. A push-down automaton can execute transitions during which is does read an input symbol. In other types of transition, however, it does not perform a read operation. Such transitions are usually called lambda transitions, or l-transitions, because the machine, in effect, reads the null string. A push-down automaton may also execute transitions during which it pushes one or more symbols onto its store, or it pops one or more symbols. The machine might also perform moves during which it does not change the contents of its store.

The particular variety of push-down automaton discussed here performs three different types of move. These transitions are as follows:

Configurations. The configuration or instantantaneous configuration of a push-down automaton at any stage in its processing of an input string is determined by its current state and the contents of its store. For example, if the machine is in the state qj and, in addition to the empty store symbol z0, its push-down store contains symbols z1z2 ... zk, with zk being the topmost, then the configuration of the machine is specified by the string

z0z1z2 ... zk qj
In general, if the store contains the string z Z* of symbols on top of the empty store symbol while the machine is in the state qi, then its configuration can be represented as
z0z qi
The initial configuration of a push-down automaton is then represented by the string
z0 q0
where q0 is the initial state of the machine.

Reductions. As is the case with finite-state automata, the processing of a string of symbols by a push-down automaton can be treated as something like the derivation of a sentence by applying the rewriting rules of a phrase-structure grammar. Such derivation processes, however, are called reductions because, rather than beginning with the starting symbol of the grammar and producing a sentence, a reduction starts with an input string which is consumed by the automaton. Reductions can nonetheless be treated as string rewriting processes, and are begun by concatenating an input string s = w1w2 ... wn T* to the initial configuration z0q0 of the automaton. This concatenation yields the string

z0q0 w1w2 ... wn
The first move executed by the automaton is an initial or finite-state transition during which the machine shifts from its initial state q0 as it reads the first word w1 of the input string. The effect of this transition can be represented by applying the rewriting rule form of the initial or finite-state transition
z0 q0 w1 z0 q1
to the foregoing string. This process can then be represented by writing
z0q0 w1w2 ... wn z0q1w2 ... wn
where can be read as "reduces in one transition to." As a consequence of this transition, the automaton is in a new state q1, and its store still contains only the empty store symbol z0. Thus, the new configuration of the machine corresponds to the string z0q1. Since the first word w1 has been read, the input tape now contains the string w2 ... wn.

Subsequent steps in the reduction are performed by read/push and lambda/pop transitions. For example, application of a rewriting rule with the form

z0 q1 w2 q1 q2
corresponding to a read/push transition yields a reduction that can be represented as
z0q1 w2 ... wn z0q1 q2w3 ... wn
During this transition, the word w2 is read, so that the input tape now contains the string w3 ... wn, and the machine shifts into the new state q2. The label of its previous state, q1, is pushed onto the store so that it contains the two symbols z0 q1.

The next move can be a lambda/pop transition corresponding to a rule of the form

q1 q2 l z0 q3
whereby the automaton does not read an input symbol, but pops the symbol q1 from its store while shifting from the state q2 into the new state q3. The result can be represented by the following reduction:
z0q1q2 w3 ... wn z0q3 w3 ... wn
Further read/push and lambda/pop transitions might be executed until all the input symbols have been read and the automaton stops in the following configuration:
z0zqh
where z Z* denotes a string of symbols above the empty store symbol z0 in the push-down store with the machine in the state qh. The reduction to this configuration can be represented by writing
z0q0 w1w2 ... wn * z0zqh
where * is read as "reduces in finitely many transitions to."

Accepting Conditions. A push-down automaton that has read all the symbols of an input string s T* and halts in the configuration

z0zqh
is said to accept s if either one or the other of the following conditions is met:

Language of the Machine. The language L(A) accepted by the push-down automaton A may be defined as follows:

L(A) = {s | s T*, z0q0s * z0zqh, qh H, or z = l}

Equivalence to Context-Free Grammars. Push-Down automata are said to be equivalent to context-free (Chomsky Type 2) grammars, which means that, given an arbitrary context-free grammar G', a push-down automaton A can be divised that accepts just the sentences generated by G'. In other words,

L(A) = L(G')
where L(A) is the language of the automaton A, and L(G') is the language generated by G'.

The construction of an equivalent push-down automaton proceeds as follows. In the first place, recall that the rewriting rules of a context-free grammar have the general form

A a
where A denotes a nonterminal, while a represents a string that can consist of both terminal and non-terminal symbols. Recall further that, given an arbitrary context-free grammar G', an equivalent Chomsky Normal Form grammar G can be devised such that all its rules have one or the other of the following forms
B b
B C D
where B, C, and D are non-terminals while b is a terminal symbol. If
G = VN, VT, S, P
then a push-down automaton A where
A = Z, Q, T, M, q0, H
that is equivalent to G can be specified as follows: and with the transition function M of A consisting of elements corresponding to members of the production set P of G such that Thus, the terminal symbols of the grammar G become the input symbols of the automaton A, while the nonterminal symbols of G become the state labels and store symbols ofA, but with the addition of the special initial state label q0 and empty store symbol z0. The starting symbol S of G labels an accepting state of A (unlike the situation with the finite-state automaton where S labels the initial state). The rewriting rules of the form B C D in the grammar become the lambda/pop transitions of the automaton, with the B b rules becoming initial and read/push transitions.

Consequently, if s is a sentence generated by G, that is s L(G), then for every rule in P applied to generate s, there is a transition in M that can be applied to accept s. Hence, s L(A) so that L(G) L(A). Conversely, if s L(A), a sequence of rules can be found in P that matches the sequence of transitions in M that are executed to accept s. Hence, s L(G) so that L(A) L(G). Consequently, we have that

L(A) = L(G)

Example. The accompanying diagram shows an accepting path in the transition network of a push-down automaton equivalent to a context-free grammar that generates the sentence

some dogs like the cat
The store, state, and tape vocabularies of the automaton therefore include the following sets of symbols:
{DET, N, V, NP, VP, S, #} Z
{DET, N, V, NP, VP, S, $} Q
{the, some, cat, dogs, like} T
The '$' sign labels the initial state of the machine, and the '#' sign is its empty store symbol. The set of accepting states H of the automaton consists of the starting symbol S of the grammar. The transition function of the machine consists of the following elements, where the individual transitions are listed, with the rewriting rules of the grammar to which they correspond, in the order in which the automaton executes them to accept the foregoing sentence:
Transition :: Rewriting Rule
# $ some # DET :: DET some
# DET dogs DET N :: N dogs
DET N l # NP :: NP DET N
# NP like NP V :: V like
NP V the V DET :: DET the
V DET cat DET N :: N cat
DET N l V NP :: NP DET N
V NP l NP VP :: VP V NP
NP VP l # S :: S NP VP

The reduction representing the configurations of the automaton and the contents of its input tape as the machine accepts the sentence is shown in the following table with the machine's state and the contents of its store and tape in separate columns for clarity:

Store State Tape
# $ some dogs like the cat
# DET dogs like the cat
# DET N like the cat
# NP like the cat
# NP V the cat
# NP V DET cat
# NP V DET N
# NP V NP
# NP VP
# S

Acceptance Process. The automaton begins the process of accepting the string

some dogs like the cat
by executing an initial, finite-state transition during which it reads the first word 'some' while shifting out of its initial state, labelled '$', into a state labelled DET, thereby identifying the word as a determiner. In its next move, the automaton performs a read/push transition to read the next word, 'dogs,' as it shifts into the state labelled N to identify this word as a noun. During this transition, the machine records the label DET of its previous state by pushing it onto the store. This operation is indicated in the diagram by the forward slant or slash '/' character followed by the label DET on the arc between the DET and N nodes. Note that this notation is not conventional; normally only the word read during a read/push transition appears on the arc. The notation used in the accompanying diagram, however, helps in describing the operations of the automaton as it accepts a string.

Push-Down Automaton Transition Network

The next move performed by the automaton is a lambda/pop transition during which the machine shifts into the state labelled NP while popping the label DET from its store. This operation is indicated by the backward slant or slash '\' character preceding the symbol that is popped. By recovering this symbol from its memory, the machine is recalling the label of a previous state. In this case, it recalls that the word that precedes the noun it has just read is a determiner, and hence, is able to identify the two words as comprising a noun phrase.

During the next move, which is a read/push transition, the automaton reads the word 'like' while moving into the state labelled V and pushing the label NP onto its store in order to record that it has previously identified a noun phrase. The move immediately following is another read/push transition during which the automaton reads the word 'the' as it shifts into the DET state and pushes the label V onto its store. At this point, therefore, the machine has the two state labels, NP and V, in its store on top of the empty store symbol.

The machine pushes a third label onto its store, namely DET, as it executes another read/push transition to read the word 'cat' as it shifts into the N state. The machine then executes a lambda/pop transition as it moves into the NP state while popping the DET label, thereby identifying the last two words, 'the cat,' as a noun phrase.

The next move is another lambda/pop transition during which the automaton pops the V label from its store as it shifts to the VP state. In this process, the machine recalls the category of the word 'like' which it read three moves earlier. The last transition brings the machine into the accepting state labelled S as the automaton empties its store by popping the label NP which was originally recorded after the first two words, 'some dogs,' of the string were read. Since the input tape contains no more words, and both accepting conditions are actually satisfied, the five words are accepted as a sentence in the language of the automaton.

Linguistics 484 Notes Index Top of Page