Ling484 Notes |
A. C. Brett acbrett@uvic.ca
Department of Linguistics University of Victoria Clearihue C139 |
This note describes the architecture and operation of a push-down automaton, and demonstrates the equivalence of this machine to context-free grammar. The following topics are discussed:
Push-Down Store. The stack or push-down store of a push-down automaton can be treated as a list. Symbols can be added to or removed from this list, however, only from one end. This end of the list is called the top of the store. When a symbol is added to the store, it is said to be pushed onto the top of the store. When a symbol is removed, it is said to be popped from the store. A symbol within the store can popped only after all the symbols above it have been popped. Thus, the last symbol pushed onto the store is always the first symbol to be popped from it. Hence, a push-down store is sometimes called a last in/first out memory.
The symbols stored in the push-down store of a push-down automaton are the labels of its states. Thus, the machine, in effect, uses its store to record the labels of certain of the states it has experienced in the course of its processing an input string. Because the store is a last in/first out memory, the labels of the states experienced earlier in the processing of the string are beneath those of the more recently experienced states which are closer to the top of the store.
Although the store of a push-down automaton has a top, it is not considered to have a bottom. A push-down store can, in principle, record infinitely many items. The end of the list of stored items, however, is usually marked with a special symbol called the empty store symbol because when the store is empty, this special symbol is at the top. Thus, the empty store symbol is added to the state label vocabulary of the automaton to make up its store vocabulary or push-down vocabulary.
Ordered Sextuple Specification. A Push-Down Automaton A can be specified by the entries in the following ordered sextuple:
A = á Z, Q, T, M, q_{0}, H ñwherein
Transition Function. The transition function, M, of the automaton, which is also called the transition matrix or transition table of the machine, can be represented as a set of ordered quintuples with the following form:
á z_{i}, q_{i}, w, z_{j}, q_{j} ñwhere
z_{i} q_{i} w ® z_{j} q_{j}Unlike a finite-state automaton, most implementations of which execute only one variety of transition, and during its transitions, the machine always reads an input symbol, a push-down automaton can perform several different types of transition. A push-down automaton can execute transitions during which is does read an input symbol. In other types of transition, however, it does not perform a read operation. Such transitions are usually called lambda transitions, or l-transitions, because the machine, in effect, reads the null string. A push-down automaton may also execute transitions during which it pushes one or more symbols onto its store, or it pops one or more symbols. The machine might also perform moves during which it does not change the contents of its store.
The particular variety of push-down automaton discussed here performs three different types of move. These transitions are as follows:
z_{0} q_{0} w_{1} ® z_{0} q_{1}where q_{0} is the initial state of the machine, q_{1} is the state into which it shifts while reading the first word w_{1} of the input string, and z_{0} is the empty store symbol, which appears at the topmost of the store because the store is empty and this condition does not change during the move.
z_{i} q_{i} w_{j} ® q_{i} q_{j}where the machine moves out of the state q_{i} into the state q_{j} while reading the input symbol w_{j} and pushing the label q_{i} of its old state onto the top of its store to cover the previous topmost symbol z_{i} in the store.
z_{i} q_{i} l ® z_{j} q_{j}where z_{j}, the topmost symbol in the store with the machine in its new state q_{j}, is the symbol that was immediately below the topmost symbol z_{i} in the store with the automaton in its old state q_{i}.
Configurations. The configuration or instantantaneous configuration of a push-down automaton at any stage in its processing of an input string is determined by its current state and the contents of its store. For example, if the machine is in the state q_{j} and, in addition to the empty store symbol z_{0}, its push-down store contains symbols z_{1}z_{2} ... z_{k}, with z_{k} being the topmost, then the configuration of the machine is specified by the string
z_{0}z_{1}z_{2} ... z_{k} q_{j}In general, if the store contains the string z Î Z^{*} of symbols on top of the empty store symbol while the machine is in the state q_{i}, then its configuration can be represented as
z_{0}z q_{i}The initial configuration of a push-down automaton is then represented by the string
z_{0} q_{0}where q_{0} is the initial state of the machine.
Reductions. As is the case with finite-state automata, the processing of a string of symbols by a push-down automaton can be treated as something like the derivation of a sentence by applying the rewriting rules of a phrase-structure grammar. Such derivation processes, however, are called reductions because, rather than beginning with the starting symbol of the grammar and producing a sentence, a reduction starts with an input string which is consumed by the automaton. Reductions can nonetheless be treated as string rewriting processes, and are begun by concatenating an input string s = w_{1}w_{2} ... w_{n} Î T^{*} to the initial configuration z_{0}q_{0} of the automaton. This concatenation yields the string
z_{0}q_{0} w_{1}w_{2} ... w_{n}The first move executed by the automaton is an initial or finite-state transition during which the machine shifts from its initial state q_{0} as it reads the first word w_{1} of the input string. The effect of this transition can be represented by applying the rewriting rule form of the initial or finite-state transition
z_{0} q_{0} w_{1} ® z_{0} q_{1}to the foregoing string. This process can then be represented by writing
z_{0}q_{0} w_{1}w_{2} ... w_{n} Þ z_{0}q_{1}w_{2} ... w_{n}where Þ can be read as "reduces in one transition to." As a consequence of this transition, the automaton is in a new state q_{1}, and its store still contains only the empty store symbol z_{0}. Thus, the new configuration of the machine corresponds to the string z_{0}q_{1}. Since the first word w_{1} has been read, the input tape now contains the string w_{2} ... w_{n}.
Subsequent steps in the reduction are performed by read/push and lambda/pop transitions. For example, application of a rewriting rule with the form
z_{0} q_{1} w_{2} ® q_{1} q_{2}corresponding to a read/push transition yields a reduction that can be represented as
z_{0}q_{1} w_{2} ... w_{n} Þ z_{0}q_{1} q_{2}w_{3} ... w_{n}During this transition, the word w_{2} is read, so that the input tape now contains the string w_{3} ... w_{n}, and the machine shifts into the new state q_{2}. The label of its previous state, q_{1}, is pushed onto the store so that it contains the two symbols z_{0} q_{1}.
The next move can be a lambda/pop transition corresponding to a rule of the form
q_{1} q_{2} l ® z_{0} q_{3}whereby the automaton does not read an input symbol, but pops the symbol q_{1} from its store while shifting from the state q_{2} into the new state q_{3}. The result can be represented by the following reduction:
z_{0}q_{1}q_{2} w_{3} ... w_{n} Þ z_{0}q_{3} w_{3} ... w_{n}Further read/push and lambda/pop transitions might be executed until all the input symbols have been read and the automaton stops in the following configuration:
z_{0}zq_{h}where z Î Z^{*} denotes a string of symbols above the empty store symbol z_{0} in the push-down store with the machine in the state q_{h}. The reduction to this configuration can be represented by writing
z_{0}q_{0} w_{1}w_{2} ... w_{n} Þ^{*} z_{0}zq_{h}where Þ^{*} is read as "reduces in finitely many transitions to."
Accepting Conditions. A push-down automaton that has read all the symbols of an input string s Î T^{*} and halts in the configuration
z_{0}zq_{h}is said to accept s if either one or the other of the following conditions is met:
Language of the Machine. The language L(A) accepted by the push-down automaton A may be defined as follows:
L(A) = {s | s Î T^{*}, z_{0}q_{0}s Þ^{*} z_{0}zq_{h}, q_{h}Î H, or z = l}
Equivalence to Context-Free Grammars. Push-Down automata are said to be equivalent to context-free (Chomsky Type 2) grammars, which means that, given an arbitrary context-free grammar G', a push-down automaton A can be divised that accepts just the sentences generated by G'. In other words,
L(A) = L(G')where L(A) is the language of the automaton A, and L(G') is the language generated by G'.
The construction of an equivalent push-down automaton proceeds as follows. In the first place, recall that the rewriting rules of a context-free grammar have the general form
A ® awhere A denotes a nonterminal, while a represents a string that can consist of both terminal and non-terminal symbols. Recall further that, given an arbitrary context-free grammar G', an equivalent Chomsky Normal Form grammar G can be devised such that all its rules have one or the other of the following forms
B ® bwhere B, C, and D are non-terminals while b is a terminal symbol. If
B ® C D
G = á V_{N}, V_{T}, S, P ñthen a push-down automaton A where
A = á Z, Q, T, M, q_{0}, H ñthat is equivalent to G can be specified as follows:
Consequently, if s is a sentence generated by G, that is s Î L(G), then for every rule in P applied to generate s, there is a transition in M that can be applied to accept s. Hence, s Î L(A) so that L(G) Í L(A). Conversely, if s Î L(A), a sequence of rules can be found in P that matches the sequence of transitions in M that are executed to accept s. Hence, s Î L(G) so that L(A) Í L(G). Consequently, we have that
L(A) = L(G)
Example. The accompanying diagram shows an accepting path in the transition network of a push-down automaton equivalent to a context-free grammar that generates the sentence
some dogs like the catThe store, state, and tape vocabularies of the automaton therefore include the following sets of symbols:
The '$' sign labels the initial state of the machine, and the '#' sign is its empty store symbol. The set of accepting states H of the automaton consists of the starting symbol S of the grammar. The transition function of the machine consists of the following elements, where the individual transitions are listed, with the rewriting rules of the grammar to which they correspond, in the order in which the automaton executes them to accept the foregoing sentence:
{DET, N, V, NP, VP, S, #} Í Z {DET, N, V, NP, VP, S, $} Í Q {the, some, cat, dogs, like} Í T
Transition :: Rewriting Rule # $ some ® # DET :: DET ® some # DET dogs ® DET N :: N ® dogs DET N l ® # NP :: NP ® DET N # NP like ® NP V :: V ® like NP V the ® V DET :: DET ® the V DET cat ® DET N :: N ® cat DET N l ® V NP :: NP ® DET N V NP l ® NP VP :: VP ® V NP NP VP l ® # S :: S ® NP VP
The reduction representing the configurations of the automaton and the contents of its input tape as the machine accepts the sentence is shown in the following table with the machine's state and the contents of its store and tape in separate columns for clarity:
Store State Tape # $ some dogs like the cat Þ # DET dogs like the cat Þ # DET N like the cat Þ # NP like the cat Þ # NP V the cat Þ # NP V DET cat Þ # NP V DET N Þ # NP V NP Þ # NP VP Þ # S
Acceptance Process. The automaton begins the process of accepting the string
some dogs like the catby executing an initial, finite-state transition during which it reads the first word 'some' while shifting out of its initial state, labelled '$', into a state labelled DET, thereby identifying the word as a determiner. In its next move, the automaton performs a read/push transition to read the next word, 'dogs,' as it shifts into the state labelled N to identify this word as a noun. During this transition, the machine records the label DET of its previous state by pushing it onto the store. This operation is indicated in the diagram by the forward slant or slash '/' character followed by the label DET on the arc between the DET and N nodes. Note that this notation is not conventional; normally only the word read during a read/push transition appears on the arc. The notation used in the accompanying diagram, however, helps in describing the operations of the automaton as it accepts a string.
The next move performed by the automaton is a lambda/pop transition during which the machine shifts into the state labelled NP while popping the label DET from its store. This operation is indicated by the backward slant or slash '\' character preceding the symbol that is popped. By recovering this symbol from its memory, the machine is recalling the label of a previous state. In this case, it recalls that the word that precedes the noun it has just read is a determiner, and hence, is able to identify the two words as comprising a noun phrase.
During the next move, which is a read/push transition, the automaton reads the word 'like' while moving into the state labelled V and pushing the label NP onto its store in order to record that it has previously identified a noun phrase. The move immediately following is another read/push transition during which the automaton reads the word 'the' as it shifts into the DET state and pushes the label V onto its store. At this point, therefore, the machine has the two state labels, NP and V, in its store on top of the empty store symbol.
The machine pushes a third label onto its store, namely DET, as it executes another read/push transition to read the word 'cat' as it shifts into the N state. The machine then executes a lambda/pop transition as it moves into the NP state while popping the DET label, thereby identifying the last two words, 'the cat,' as a noun phrase.
The next move is another lambda/pop transition during which the automaton pops the V label from its store as it shifts to the VP state. In this process, the machine recalls the category of the word 'like' which it read three moves earlier. The last transition brings the machine into the accepting state labelled S as the automaton empties its store by popping the label NP which was originally recorded after the first two words, 'some dogs,' of the string were read. Since the input tape contains no more words, and both accepting conditions are actually satisfied, the five words are accepted as a sentence in the language of the automaton.
Linguistics 484 | Notes Index | Top of Page |