![]() | Ling484 Notes |
A. C. Brett acbrett@uvic.ca
Department of Linguistics University of Victoria Clearihue C139 |
This note describes the architecture and operation of a push-down automaton, and demonstrates the equivalence of this machine to context-free grammar. The following topics are discussed:
Push-Down Store. The stack or push-down store of a push-down automaton can be treated as a list. Symbols can be added to or removed from this list, however, only from one end. This end of the list is called the top of the store. When a symbol is added to the store, it is said to be pushed onto the top of the store. When a symbol is removed, it is said to be popped from the store. A symbol within the store can popped only after all the symbols above it have been popped. Thus, the last symbol pushed onto the store is always the first symbol to be popped from it. Hence, a push-down store is sometimes called a last in/first out memory.
The symbols stored in the push-down store of a push-down automaton are the labels of its states. Thus, the machine, in effect, uses its store to record the labels of certain of the states it has experienced in the course of its processing an input string. Because the store is a last in/first out memory, the labels of the states experienced earlier in the processing of the string are beneath those of the more recently experienced states which are closer to the top of the store.
Although the store of a push-down automaton has a top, it is not considered to have a bottom. A push-down store can, in principle, record infinitely many items. The end of the list of stored items, however, is usually marked with a special symbol called the empty store symbol because when the store is empty, this special symbol is at the top. Thus, the empty store symbol is added to the state label vocabulary of the automaton to make up its store vocabulary or push-down vocabulary.
Ordered Sextuple Specification. A Push-Down Automaton A can be specified by the entries in the following ordered sextuple:
A = á Z, Q, T, M, q0, H ñwherein
Transition Function. The transition function, M, of the automaton, which is also called the transition matrix or transition table of the machine, can be represented as a set of ordered quintuples with the following form:
á zi, qi, w, zj, qj ñwhere
zi qi w ® zj qjUnlike a finite-state automaton, most implementations of which execute only one variety of transition, and during its transitions, the machine always reads an input symbol, a push-down automaton can perform several different types of transition. A push-down automaton can execute transitions during which is does read an input symbol. In other types of transition, however, it does not perform a read operation. Such transitions are usually called lambda transitions, or l-transitions, because the machine, in effect, reads the null string. A push-down automaton may also execute transitions during which it pushes one or more symbols onto its store, or it pops one or more symbols. The machine might also perform moves during which it does not change the contents of its store.
The particular variety of push-down automaton discussed here performs three different types of move. These transitions are as follows:
z0 q0 w1 ® z0 q1where q0 is the initial state of the machine, q1 is the state into which it shifts while reading the first word w1 of the input string, and z0 is the empty store symbol, which appears at the topmost of the store because the store is empty and this condition does not change during the move.
zi qi wj ® qi qjwhere the machine moves out of the state qi into the state qj while reading the input symbol wj and pushing the label qi of its old state onto the top of its store to cover the previous topmost symbol zi in the store.
zi qi l ® zj qjwhere zj, the topmost symbol in the store with the machine in its new state qj, is the symbol that was immediately below the topmost symbol zi in the store with the automaton in its old state qi.
Configurations. The configuration or instantantaneous configuration of a push-down automaton at any stage in its processing of an input string is determined by its current state and the contents of its store. For example, if the machine is in the state qj and, in addition to the empty store symbol z0, its push-down store contains symbols z1z2 ... zk, with zk being the topmost, then the configuration of the machine is specified by the string
z0z1z2 ... zk qjIn general, if the store contains the string z Î Z* of symbols on top of the empty store symbol while the machine is in the state qi, then its configuration can be represented as
z0z qiThe initial configuration of a push-down automaton is then represented by the string
z0 q0where q0 is the initial state of the machine.
Reductions. As is the case with finite-state automata, the processing of a string of symbols by a push-down automaton can be treated as something like the derivation of a sentence by applying the rewriting rules of a phrase-structure grammar. Such derivation processes, however, are called reductions because, rather than beginning with the starting symbol of the grammar and producing a sentence, a reduction starts with an input string which is consumed by the automaton. Reductions can nonetheless be treated as string rewriting processes, and are begun by concatenating an input string s = w1w2 ... wn Î T* to the initial configuration z0q0 of the automaton. This concatenation yields the string
z0q0 w1w2 ... wnThe first move executed by the automaton is an initial or finite-state transition during which the machine shifts from its initial state q0 as it reads the first word w1 of the input string. The effect of this transition can be represented by applying the rewriting rule form of the initial or finite-state transition
z0 q0 w1 ® z0 q1to the foregoing string. This process can then be represented by writing
z0q0 w1w2 ... wn Þ z0q1w2 ... wnwhere Þ can be read as "reduces in one transition to." As a consequence of this transition, the automaton is in a new state q1, and its store still contains only the empty store symbol z0. Thus, the new configuration of the machine corresponds to the string z0q1. Since the first word w1 has been read, the input tape now contains the string w2 ... wn.
Subsequent steps in the reduction are performed by read/push and lambda/pop transitions. For example, application of a rewriting rule with the form
z0 q1 w2 ® q1 q2corresponding to a read/push transition yields a reduction that can be represented as
z0q1 w2 ... wn Þ z0q1 q2w3 ... wnDuring this transition, the word w2 is read, so that the input tape now contains the string w3 ... wn, and the machine shifts into the new state q2. The label of its previous state, q1, is pushed onto the store so that it contains the two symbols z0 q1.
The next move can be a lambda/pop transition corresponding to a rule of the form
q1 q2 l ® z0 q3whereby the automaton does not read an input symbol, but pops the symbol q1 from its store while shifting from the state q2 into the new state q3. The result can be represented by the following reduction:
z0q1q2 w3 ... wn Þ z0q3 w3 ... wnFurther read/push and lambda/pop transitions might be executed until all the input symbols have been read and the automaton stops in the following configuration:
z0zqhwhere z Î Z* denotes a string of symbols above the empty store symbol z0 in the push-down store with the machine in the state qh. The reduction to this configuration can be represented by writing
z0q0 w1w2 ... wn Þ* z0zqhwhere Þ* is read as "reduces in finitely many transitions to."
Accepting Conditions. A push-down automaton that has read all the symbols of an input string s Î T* and halts in the configuration
z0zqhis said to accept s if either one or the other of the following conditions is met:
Language of the Machine. The language L(A) accepted by the push-down automaton A may be defined as follows:
L(A) = {s | s Î T*, z0q0s Þ* z0zqh, qhÎ H, or z = l}
Equivalence to Context-Free Grammars. Push-Down automata are said to be equivalent to context-free (Chomsky Type 2) grammars, which means that, given an arbitrary context-free grammar G', a push-down automaton A can be divised that accepts just the sentences generated by G'. In other words,
L(A) = L(G')where L(A) is the language of the automaton A, and L(G') is the language generated by G'.
The construction of an equivalent push-down automaton proceeds as follows. In the first place, recall that the rewriting rules of a context-free grammar have the general form
A ® awhere A denotes a nonterminal, while a represents a string that can consist of both terminal and non-terminal symbols. Recall further that, given an arbitrary context-free grammar G', an equivalent Chomsky Normal Form grammar G can be devised such that all its rules have one or the other of the following forms
B ® bwhere B, C, and D are non-terminals while b is a terminal symbol. If
B ® C D
G = á VN, VT, S, P ñthen a push-down automaton A where
A = á Z, Q, T, M, q0, H ñthat is equivalent to G can be specified as follows:
Consequently, if s is a sentence generated by G, that is s Î L(G), then for every rule in P applied to generate s, there is a transition in M that can be applied to accept s. Hence, s Î L(A) so that L(G) Í L(A). Conversely, if s Î L(A), a sequence of rules can be found in P that matches the sequence of transitions in M that are executed to accept s. Hence, s Î L(G) so that L(A) Í L(G). Consequently, we have that
L(A) = L(G)
Example. The accompanying diagram shows an accepting path in the transition network of a push-down automaton equivalent to a context-free grammar that generates the sentence
some dogs like the catThe store, state, and tape vocabularies of the automaton therefore include the following sets of symbols:
The '$' sign labels the initial state of the machine, and the '#' sign is its empty store symbol. The set of accepting states H of the automaton consists of the starting symbol S of the grammar. The transition function of the machine consists of the following elements, where the individual transitions are listed, with the rewriting rules of the grammar to which they correspond, in the order in which the automaton executes them to accept the foregoing sentence:
{DET, N, V, NP, VP, S, #} Í Z {DET, N, V, NP, VP, S, $} Í Q {the, some, cat, dogs, like} Í T
Transition :: Rewriting Rule # $ some ® # DET :: DET ® some # DET dogs ® DET N :: N ® dogs DET N l ® # NP :: NP ® DET N # NP like ® NP V :: V ® like NP V the ® V DET :: DET ® the V DET cat ® DET N :: N ® cat DET N l ® V NP :: NP ® DET N V NP l ® NP VP :: VP ® V NP NP VP l ® # S :: S ® NP VP
The reduction representing the configurations of the automaton and the contents of its input tape as the machine accepts the sentence is shown in the following table with the machine's state and the contents of its store and tape in separate columns for clarity:
Store State Tape # $ some dogs like the cat Þ # DET dogs like the cat Þ # DET N like the cat Þ # NP like the cat Þ # NP V the cat Þ # NP V DET cat Þ # NP V DET N Þ # NP V NP Þ # NP VP Þ # S
Acceptance Process. The automaton begins the process of accepting the string
some dogs like the catby executing an initial, finite-state transition during which it reads the first word 'some' while shifting out of its initial state, labelled '$', into a state labelled DET, thereby identifying the word as a determiner. In its next move, the automaton performs a read/push transition to read the next word, 'dogs,' as it shifts into the state labelled N to identify this word as a noun. During this transition, the machine records the label DET of its previous state by pushing it onto the store. This operation is indicated in the diagram by the forward slant or slash '/' character followed by the label DET on the arc between the DET and N nodes. Note that this notation is not conventional; normally only the word read during a read/push transition appears on the arc. The notation used in the accompanying diagram, however, helps in describing the operations of the automaton as it accepts a string.
The next move performed by the automaton is a lambda/pop transition during which the machine shifts into the state labelled NP while popping the label DET from its store. This operation is indicated by the backward slant or slash '\' character preceding the symbol that is popped. By recovering this symbol from its memory, the machine is recalling the label of a previous state. In this case, it recalls that the word that precedes the noun it has just read is a determiner, and hence, is able to identify the two words as comprising a noun phrase.
During the next move, which is a read/push transition, the automaton reads the word 'like' while moving into the state labelled V and pushing the label NP onto its store in order to record that it has previously identified a noun phrase. The move immediately following is another read/push transition during which the automaton reads the word 'the' as it shifts into the DET state and pushes the label V onto its store. At this point, therefore, the machine has the two state labels, NP and V, in its store on top of the empty store symbol.
The machine pushes a third label onto its store, namely DET, as it executes another read/push transition to read the word 'cat' as it shifts into the N state. The machine then executes a lambda/pop transition as it moves into the NP state while popping the DET label, thereby identifying the last two words, 'the cat,' as a noun phrase.
The next move is another lambda/pop transition during which the automaton pops the V label from its store as it shifts to the VP state. In this process, the machine recalls the category of the word 'like' which it read three moves earlier. The last transition brings the machine into the accepting state labelled S as the automaton empties its store by popping the label NP which was originally recorded after the first two words, 'some dogs,' of the string were read. Since the input tape contains no more words, and both accepting conditions are actually satisfied, the five words are accepted as a sentence in the language of the automaton.
Linguistics 484 | Notes Index | Top of Page |