% Linguistics 484 -- Example: SRA Prolog % % Shift/Reduce Automaton. % % The predicates comprising this program cause the Prolog interpreter % to emulate the operations of a Shift/Reduce Automaton that includes % a predictive capability. % % This predictive or "oracular" capability imposes constraints on the % categories that can be accepted given the words of the input string % that have already been accepted by the automaton. % % The transitions executed by this automaton form the basis of the % operations performed by the LR acceptors. 'LR' in this context stands % for 'Left-to-right, Rightmost' where 'Left-to-right' means that the % input string is processed by the automaton from left to right (as % opposed to right to left). 'Rightmost' means that the right-hand sides % of rewriting rules are matched, and applied, to the input string, and % to subsequent sentential forms, during the acceptance or recognition % process. In other words, the recognition procedure works bottom-up % (as opposed to top-down) so that an LR acceptor performs a bottom-up % recognition. % % Unlike the LR acceptors, which are deterministic, the automaton % illustrated here is nondeterministic; that is, it has at least one % state out of which there are two or more transitions into different % states during which the same symbol is processed. Note that the % symbols processed by this machine include nonterminals as well as % terminal symbols. These issues, and the relation of this automaton % to the LR acceptors, are discussed further below. % % As was the case with the earlier example, the machine demonstrated % here has been adapted to display output consisting of the sequence % of its moves as it accepts an input string. Thus, typing % the following query: % % s([nice, dogs, like, cats]) . % % will cause output something like the following to be displayed: % % Action % <[], [], [nice,dogs,like,cats]> % Shift : <[a --> . nice], [nice], [dogs,like,cats]> % : <[a --> nice .], [nice], [dogs,like,cats]> % Reduce: <[], [a], [dogs,like,cats]> % : <[np --> . a n], [a], [dogs,like,cats]> % : <[np --> a . n], [a], [dogs,like,cats]> % Shift : <[n --> . dogs], [a,dogs], [like,cats]> % : <[n --> dogs .], [a,dogs], [like,cats]> % Reduce: <[np --> a . n], [a,n], [like,cats]> % : <[np --> a n .], [a,n], [like,cats]> % Reduce: <[], [np], [like,cats]> % : <[s --> . np vp], [np], [like,cats]> % : <[s --> np . vp], [np], [like,cats]> % : <[vp --> . v np], [np], [like,cats]> % Shift : <[v --> . like], [np,like], [cats]> % : <[v --> like .], [np,like], [cats]> % Reduce: <[vp --> . v np], [np,v], [cats]> % : <[vp --> v . np], [np,v], [cats]> % : <[np --> . n], [np,v], [cats]> % Shift : <[n --> . cats], [np,v,cats], []> % : <[n --> cats .], [np,v,cats], []> % Reduce: <[np --> . n], [np,v,n], []> % : <[np --> n .], [np,v,n], []> % Reduce: <[vp --> v . np], [np,v,np], []> % : <[vp --> v np .], [np,v,np], []> % Reduce: <[s --> np . vp], [np,vp], []> % : <[s --> np vp .], [np,vp], []> % Reduce: <[], [s], []> % % although extra spaces that do not appear in the actual output have been % inserted here to enhance readability and facilitate the description % below. % % As in the previous example, the first column of the foregoing table % identifies the action, either a shift or a reduce operation, executed % by the automaton. In this example, however, the automaton performs % other operations, in addition to the shift and reduce, that are not % identified by name in the table. These operations are described below. % % In both this and the previous example, the last column shows what % remains of the input string as the automaton executes each of its % operations in the course of the recognition process. % % The next column to the left shows the contents of the stack. This % stack provides temporary storage for words and syntactic category labels % as they are identified during the recognition. In this example, the % stack is identified, following convention, as the "Parse Stack" % (notwithstanding the automaton does not yield a phrase marker for the % input strings it accepts). % % The qualification "parse" is added to distinguish this stack from a % second stack, called the "Control Stack," with which this version of a % shift/reduce automaton is equipped. The existence of this stack, and % its purpose, mark the principal difference between the previous % automaton and the one illustrated in this example. % % The second column of the table shows the topmost item in the control stack % at each stage of the recognition process. In its starting state, the % control stack of the machine is empty, as indicated by the empty list % shown in the first line of the table. Thereafter, the items stored in % the control stack consist of lists the elements of which are based on % the rewriting rules that can be, or are applied during each transition % of the automaton. % % The progress of application of a rule is indicated by a dot that % appears before, between, or after the right-hand side symbols of a % rule. A dot to the left of all the right-hand side symbols means % that the rule has been identified as potentially applicable; but the % application process has not yet begun. A dot to the right of all % the right-hand side symbols means that the right-hand side symbols % can be matched so that the rule can be applied in a reduce operation. % A dot between the right-hand side symbols means that the symbols to % its left can be matched, but the symbols to the right of the dot % have not yet been encountered. The symbols to the right of the dot, % however, constitute a prediction of the symbols that are expected % during the next steps of the recognition. % % The item at the top of the control stack determines the next % transition of the machine. Thus, the different configurations % of the control stack correspond to the states of the automaton. % For example, during the initial transition of the machine, % the first word of the input string is shifted onto the parse % stack. During this transition, an item represented above as % [a --> . nice] is pushed onto the control stack. The dot % indicates that the word has been tentatively identified as an % adjective. The rule A --> nice is therefore potentially applicable % so that, during the next transition, the right-hand side is % expected to match the topmost element of the parse stack. % % During the next transition, the matching processed is performed. % Because the right-hand side does match the word on top of the parse % stack, the dot is moved to the right of the word as illustrated % by the item [a --> nice .] at the top of the control stack % as shown in the third row of the table. % % Since the expected match is successful, a reduce operation can be % performed. This operation is executed during the next transition. % The word 'nice' is popped from the parse stack, and the left-hand % side of the rule is pushed onto the parse stack. The item % [a --> nice .] is popped from the top of the control stack, % leaving it empty, as is illustrated in the fourth row of the % foregoing empty. % % When its control stack is empty, the automaton executes a % transition during which it selects a rule with a first right-hand % side symbol that is expected to match the symbol on top of the parse % stack, and pushes a corresponding item onto its control stack. % The result of such a transition is shown in the fifth row of the % table wherein the item [np --> . a n] appears in the control % stack. Note that, as was described in connection with the % earlier transitions, the selection of a potentially applicable % rule constitutes a separate operation from from the actual % application of the rule. Hence, in this case, the dot appears % to the left of the symbol 'a' that is expected to match the % topmost symbol of the parse stack. % % The actual match of the symbol 'a' is performed in the next % transition. Since, as expected, the match is successful, the % dot is moved over the 'a' so that it appears between the % 'a' and the 'n' as shown in the item [np --> a . n] on top % of the control stack in the sixth row of the foregoing table. % % The item [np --> a . n] on the control stack predicts % that the next word of the input string will be a noun. % Therefore, during its next transition, the automaton performs % a shift operation during which the word 'dogs' is pushed % on top of the symbol 'a' in the parse stack, and the item % [n --> . dogs] is pushed onto the control stack. Note that, % although the seventh row of the table shows only this new in % the control stack, the item [np --> a . n] previously on top % of the control stack is below the item [n --> . dogs] and % hence does not appear in the table. % % The symbol 'dogs' to the right of the dot is confirmed to % match the word on top of the parse stack during the next % transition so that dot is moved to the right leaving the % item [n --> dogs .] on top of the control stack as shown % in the eighth row of the table. % % A reduce operation can now be performed that leaves the % symbol 'n' on top of the parse stack as illustrated by the % ninth row of the table. This row also shows that the item % [n --> dogs .] from the control stack so that the item below % it, [np --> a . n] , again appears at the top. % % Since the symbol 'n' to the right of the dot in the item % [np --> a . n] matches the symbol now on top of the parse % stack, the dot is shifted to the right during the next % transition so that the item [np --> a n .] appears on top % of the control stack. % % Both right-hand side symbols of the rule NP --> A N have % at this stage been found to match symbols in the parse % stack. A reduce operation can consequently be performed % during the next transition. This operation results in the % symbol 'np' being on top of the parse stack, and since % the item [np --> a n .] has been popped, the control % stack is now empty. % % Since the control stack is empty at this stage, the automaton % selects a rule the first right-hand side of which potentially % matches the symbol presently on top of the parse stack. % Consequently, the item [s --> . np vp] is pushed onto the % control stack. The 'np' is confirmed to yield a match during % the following transition so that the dot is moved to the right. % Hence, the item [s --> np . vp] appears in the control stack. % % At this stage, however, the symbol 'vp' to the right of the % dot does not match the topmost symbol of the parse stack. % Therefore, the automaton selects a new rule the left-hand side % of which matches the 'vp' . The rule selected corresponds to % the item [vp --> . v np] that is pushed on top of the item % [s --> np . vp] in the control stack. % % The new item [vp --> . v np] predicts that the next word % of the input string will be in the category 'v' , which % leads to the shift operation whereby the word 'like' is % pushed onto the parse stack, and the item [v --> . likes] is % pushed onto the control stack. % % Note that, at this stage, the control stack contains three % items. The item [v --> . likes] is on top of the item % [vp --> . v np] , which is in turn on top of the item % [s --> np . vp] . Following the reduce operation, during % which the symbol 'v' is pushed on top of 'np' in the % parse stack, the item [v --> . likes] is popped from the % control stack so that the item [vp --> . v np] again % appears on top. % % Since the 'v' to the right of the dot in the item % [vp --> . v np] matches the symbol on top of the parse stack, % the dot is moved so that it appears to the left of the 'np' . % The item [vp --> v . np] is therefore now on top of the % control stack. This 'np' , however, cannot be matched with % the topmost symbol 'v' in the parse stack. Consequently, a % rule with the left-hand side symbol 'np' is selected. % % The output illustrated in the foregoing table shows that the % rule NP --> N has been selected so that the corresponding item % [np --> . n] is pushed onto the control stack. Note, however, % that the rule NP --> A N might have been selected instead. % In this case, the automaton will undergo a transition into a state % wherein the item [np --> . a n] is on top of the control stack. % Thus, in the state defined by the control stack with the item % [vp --> v . np] as its topmost element, the automaton has two % different transitions available to it. Of these two possibilities, % the transition that will lead to acceptance of the input string % cannot be decided until the next word is encountered. % % The decision in such cases could be made if the automaton were % able to look ahead. In this instance, looking ahead one word % would be sufficient. The automaton illustrated here, however, % does not have this capability. The problem is resolved in this % example by including the NP --> N rule before the NP --> A N % rule so that the appropriate one is selected first. Also, because % the automaton is implemented in Prolog, even if the inappropriate % rule were selected first, the Interpreter would backup and try % the alternative rule when the one originally selected failed to % yield acceptance. % % In another solution to the rule selection problem, both potentially % applicable rules would be identified, and the automaton would % perform the two resulting transitions in parallel. Only when one % of these parallel paths fails to yield acceptance of the input % string would it be abandoned, and the successful path is pursued. In % practice, however, parallel paths are actually pursued serially, in % succession, by storing alternative states of the automaton so that % they are available at each stage of the acceptance process. % % Although it might not appear as a solution to the rule selection % problem, one response to this problem is to consider only grammars % which eliminate the possibility that two are more rules are potentially % applicable at some stage of the acceptance process. An automaton % equivalent to one of these grammars would be deterministic. Such grammars % generate subsets of the context-free languages. Hence, there are % context-free languages that cannot be generated by these grammars, % nor accepted by the deterministic automata to which these grammars % are equivalent. % % A sentence such as that used as an example here would not be accepted % by an automaton that is equivalent to one of these grammars. The % automaton illustrated here, however, is not deterministic. Its % nondeterministic operation is accommodated by the backtracking % facility of the Prolog Interpreter. In this case, it happens that % the item [np --> . n] is pushed onto the control stack, and the % resulting transitions lead to acceptance of the word 'cats' on the % basis of the rule N --> cats corresponding to the item % [n --> . cats] , and then [n --> cats .] , on top of the control % stack, so that backtracking is not required. % % With acceptance of the final word, the list representing the input % string is empty. The remainder of the transitions of the automaton, % consisting of a sequence of reduce operations (and the related % operations associated with symbol matching that are represented by % moving the dot over the matched symbols) correspond to the % transitions already described above. During these transitions, % symbols on the parse stack are reduced, and items previously pushed % onto the control stack are popped. The last of these transitions % leaves the control stack empty, and the sentence symbol 's' on the % parse stack. This configuration of the automaton, with all symbols % of the input string having been consumed, signals acceptance of the % string as a sentence. %s/1 s(String) : The list of data tokens in String comprises a % sentence if it is accepted by a shift/reduce % automaton the transitions of which are emulated % by the move/4 predicate. % % The start/2 predicate is invoked to start the % the recognition process by shifting the first % word of the input string onto the parse stack % and by pushing onto the control stack the lexical % rule that identifies the category of the word. % (Note that an empty control stack symbol is also % stored beneath the lexical rule.) % % The finish/2 is invoked to display the configurations % of the machine during the recognition. The list_moves/1 % predicate is invoked for this purpose, after some % titles are displayed. % s( String ) :- start( String, Moves ), finish( String, Moves ) . %start/2 : Start the recognition process by shifting the first word % of the input string onto the parse stack, and pushing the % lexical category "dotted" rule, Cat --> . Word , onto the % control stack of the automaton. Then, invoke the move/4 % predicate to perform the subsequent steps of the recognition. % start( [Word | String], [['Shift ', [Cat, [Word], []], [Word], String] | Moves] ) :- word( Cat, [Word] ), move( String, [Word], [[Cat, [Word], []], [e]], Moves ) . %finish/2 : The string has been accepted; so display a table of the % configurations of the machine during the acceptance process % by invoking the list_moves/1 predicate. % finish( String, Moves ) :- nl, write('Action '), nl, write(' <[], [], '), write(String), write('>'), nl, list_moves(Moves). % move/4 : Controls the transitions of the shift/reduce automaton. % % move(String, Stack, Control, Moves) : String is a list of the % words comprising the input string; % Stack is a list representing the push-down store of the % automaton; Control is a list representing the control % stack of the automaton; and, Moves is a list the elements % of which trace the acceptance path of the string. % % The Moves list includes the action performed by the automaton % (such as a shift or a reduce operation), the item ("dotted" % rule) at the top of the Control stack, the contents of the % parse stack, and the remainder of the input string to be % recognised. Note that the information stored at Moves % serves only to trace the state and operations of the % machine. % % The first clause tests whether or not the input string % has been consumed, the parse stack contains just the root % or starting symbol of the grammar, and the control stack % contains just the empty stack symbol. If this test % succeeds, the string is accepted. % move( [], [s], [[e]], [] ) . % If the first as yet unmatched right-hand label of the "dotted" % rule at the top of the control stack coincides with the category % label of the word presently at the head of the input string, % shift the word onto the parse stack, and push the lexical % category rule onto the control stack. move( [Word | String], Stack, [[LHS, [Cat | Cats], RHSs] | Control], [['Shift ', [Cat, [Word], []], [Word | Stack], String] | Moves] ) :- word( Cat, [Word] ), move( String, [Word | Stack], [[Cat, [Word], []], [LHS, [Cat | Cats], RHSs] | Control], Moves ) . % All of the right-hand side labels of the "dotted" rule on top of the % control have been matched by labels that have been pushed onto the % parse stack; so, perform a reduce operation, and pop the satisfied % rule off of the control stack. move( String, Stack_Before, [[LHS, [], RHS], Next | Control], [['Reduce', Next, Stack_After, String] | Moves] ) :- reduce( LHS, RHS, Stack_Before, Stack_After), move( String, Stack_After, [Next | Control], Moves ) . % A previously unmatched right-hand side label of a rule at the top % of the control stack is found to match the label on top of the % parse stack; so, move the dot in the rule over the matching label, % and replace the rule on the control stack with the updated rule. move( String, [Cat | Stack], [[LHS, [Cat | Cats], RHSs] | Control], [[' ', [LHS, Cats, [Cat | RHSs]], [Cat | Stack], String] | Moves] ) :- move( String, [Cat | Stack], [[LHS, Cats, [Cat | RHSs]] | Control], Moves ) . % Push onto the control stack a rule the left-hand side of which % coincides with leading unmatched label on the right-hand side % of the rule presently on top of the control stack. The dot in % this new rule is to the left of all the right-hand side labels. % This operation represents the predictive or oracular function % of the shift-reduce acceptor. move( String, Stack, [[LHS, [Cat | Cats], RHSs] | Control], [[' ', [Cat, RHS, []], Stack, String] | Moves] ) :- rule( Cat, RHS ), move( String, Stack, [[Cat, RHS, []], [LHS, [Cat | Cats], RHSs] | Control], Moves ) . % If the control stack is empty at this stage of the recognition, % push onto the control stack a rule the first right-hand side label % of which matches the label on top of the parse stack. The dot in % this newly-introduced rule is to the left of all its right-hand % side labels. move( String, [Cat | Stack], [[e]], [[' ', [LHS, [Cat | Cats], []], [Cat | Stack], String] | Moves] ) :- rule( LHS, [Cat | Cats] ), move( String, [Cat | Stack], [[LHS, [Cat | Cats], []], [e]], Moves ) . % reduce/4 : Determines whether or not the last right-hand side % symbol of a rule matches the symbol presently at % the top of the stack. If there is a match, the % pop/3 predicate is proved to perform the pop % operation. % % reduce(LHS, RHS, Stack_In, Stack_Out) : Stack_In is the stack % before the reduce operation, and % Stack_Out is the stack afterward. % % (Note that the match of the stack top % with the last right-hand side symbol of a rule % is necessary because symbols are stored in the % stack in reverse order.) reduce( LHS, RHS, Stack, [LHS | Stack_Out] ) :- pop( RHS, Stack, Stack_Out ). % pop/3 % % pop( RHS, Stack_In, Stack_Out ) : Given a rule with right- % hand side RHS and stack contents in Stack_In, performs % a reduce operation by matching the right-hand side % symbols of the rule with the symbols on the top % of the stack, and popping them from the stack, with % the resulting stack contents returned in Stack_Out. % pop( [], Stack, Stack ) . pop( [Last | RHS], [Last | Stack], Stack_Out ) :- pop( RHS, Stack, Stack_Out ) . %list_moves/1 % % list_moves( M ) : M is a list of the moves executed by the % shift-reduce acceptor during an analysis. % The show_move/2 predicate is invoked to % display the topmost item in the Control % stack, the contents of the parse stack, % and the remaining items of the input % at each transition. list_moves( [] ). list_moves( [[Action, Control, Stack, Input] | Moves] ) :- write( Action ), write( ': ' ), show_move( Control, Stack, Input ), list_moves( Moves ) . %show_move/2 % % show_move( Control, Stack, Input ) : Displays the item, a % "dotted" rule, at the top % of the Control stack, together with the contents of % the "parse" Stack and the remaining segment of the % input string, with these three entities represented % as an ordered triple. % % (Note that, the order of the symbols in the stack is % reversed by the reverse/2, reverse/3 predicates. % The store is emulated by a Prolog list, which means % symbols are added, or pushed onto the left-end hand, % which therefore acts as the top of the store. % For the display produced here it is required that the % top of the store be the right-hand end of the list.) % % The show_rule/1 predicate is invoked to display the % "dotted" rule at the top of the Control stack. show_move( Control, Stack, Input ) :- reverse( Stack, Store ), write( '<' ), show_rule( Control ), write( ', ' ), write( Store ), write( ', ' ), write( Input ), write( '>' ), nl. %show_rule/1 show_rule( [e] ) :- write('[]') . show_rule( [LHS, RHStoFind, RHSfound] ) :- reverse( RHSfound, FoundRHS ), write('['), write(LHS), write(' -->'), show_list( FoundRHS ), write(' .'), show_list( RHStoFind ), write(']') . %show_list/1 show_list( [] ) . show_list( [Item | Rest] ) :- write(' '), write( Item ), show_list( Rest ) . % reverse/2, reverse/3 % % Predicates to reverse the order of the elements % of a list. (Used here in displaying the contents % of the parse stack, and the category labels to % the right of the dotted in "dotted" rules.) reverse(AB, BA) :- reverse(AB, BA, []). reverse([], U, U). reverse([H | AB], U, W) :- reverse(AB, U, [H | W]). % rule/2 % % rule( LHS, RHS ) : LHS is the left-hand side label % of a rule with the list % of right-hand side labels RHS. % rule( s, [np, vp] ) . rule( np, [n] ) . rule( np, [a, n] ) . rule( vp, [v, np] ) . % word/2 % % rule( Cat, [Word] ) : Cat is the category label % of the lexical item Word. % word( a, [nice] ) . word( n, [dogs] ) . word( v, [like] ) . word( n, [cats] ) .