% Linguistics 484 -- RTN Prolog % % Recursive Transition Network: RTN Acceptor. % % This example consists of predicates that cause the Prolog interpreter to % emulate the working of a RECURSIVE TRANSITION NETWORK (RTN) Acceptor. % Such systems are also called Transition Network Grammars (TNGs). % % An RTN or TNG represents a transcription of the rules of a context-free % grammar into the arcs of a network. Sentences generated by the grammar % are accepted by the RTN by traversing the network comprised of these arcs. % % RTNs form the core or basic component of the Augmented Transitions % Networks (ATNs). An ATN is an RTN with additional resources that permit % it to process the complex attributes or features of lexical and syntactic % categories. Thus, for example, an ATN can perform subject/verb number % agreement. % % RTNs resemble Finite-State and Push-Down Automata (FSAs and PDAs) in % several respects. In the first place, like FSAs and PDAs, RTNs are % recognisers (acceptors) of sentences generated by phrase-structure % grammars. In fact, RTNs are equivalent to PDAs in that they are capable % of accepting sentences generated by Chomsky Type 2 (context-free) grammars. % RTNs also resemble FSAs and PDAs in that their workings may be represented % employing transition networks. RTNs actually developed out of the concept % of the transition network associated with an FSA (and hence, their name). % % Beyond their pedigree, however, RTNs share little else with FSAs and differ % substantively from FSA transition networks. The most obvious of the % differences between the two classes of machine lies in the labelling of the % states (nodes) and arcs of their transition networks. While the states of % the transition network of an FSA are labelled by nonterminal symbols of the % grammar to which the automaton is equivalent, and the arcs are labelled by % the terminals, the arcs of an RTN are labelled by the nonterminals of the % equivalent grammar, and the nodes (states) are labelled arbitrarily, with % the labelling employed solely to distinguish among them. Thus, there is % normally no connection between the nodes of an RTN and the grammar to which % it corresponds (is equivalent). % % Note that, although the foregoing observations cite FSA transition networks, % these statements apply equally to PDA networks, with the exception that % in the case of the latter machines, arcs corresponding to lambda/pop % transitions are not labelled by terminal symbols of the equivalent grammar. % % Note also that, notwithstanding the statements above regarding the % labelling of RTN arcs only with nonterminal symbols, some special arcs can % be treated as if they were labelled with terminals. In some treatments of % RTNs, those arcs involved directly in the processing (reading and accepting) % of terminal symbols can be labelled with the terminal symbols. This issue % will be discussed below. % % Another significant difference between RTNs and the transition networks of % FSAs and PDAs is that, while the networks of the latter machines consist of % one complete network, the network of the former can be treated as a % collection of subnetworks. There is one subnetwork for each distinct left- % hand side symbol in the rules of the production set of the grammar to which % the RTN is equivalent. For example, if the grammar includes the rewrite % rules S --> NP VP, NP --> DET N, and VP --> V NP, then the RTN will % consist of at least three subnetworks, one each for the symbols S, NP, and % VP. In some treatments of RTNs, there will also be subnetworks for the % preterminal, lexical category symbols DET, N, and V which appear on the % left-hand sides of the lexical rules of the grammar. % % Any arc of a subnetwork can be traversed only if there exists a path in the % subnetwork the name of which is the label of the arc. For example, % traversal of an NP-arc in an S-subnetwork with a path corresponding to the % rule S --> NP VP requires that a path of the NP-subnetwork be traversed. % Details of the processes involved in the testing of arcs and the traversal % of paths in the subnetworks comprising an RTN are included in the comments % accompanying the individual predicates below. % %s/1 % % s( L ) : L is a list of symbols accepted as a sentence by the Recursive % Transition Network (RTN) illustrated in this example. % % The s/1 predicate consists of one assertion, a rule the body of % which contains the goal push/3. This goal tests the assertion % that the entire list of data symbols Data is accepted, as a % consequence of the traversal of a path of the "s"-subnetwork, % leaving the null list; that is, there exists a path of the % "s"-subnetwork during the traversal of which all the symbols % in the list Data are consumed. % s( Data ) :- push( s, Data, [] ) . %sub_net/4 % % sub_net( Net, Node, L, L1 ) : L1 is the list of data symbols left over from % the list L after its initial elements have been % accepted during the traversal of an arc of the % subnetwork Net from the node (state) Node. % % The sub_net/4 predicate employs recursion to effect a traversal % of the subnetwork identified in its first argument. With each % recursive iteration, one arc of the subnetwork is traversed. % The starting node (state) of this arc is identified in the % second argument. The sequence of data symbols not yet accepted % is passed to sub_net/4 as its third argument. The symbols % remaining, after those accepted during the traversal of the % current arc have been removed, are passed back to the calling % predicate in the fourth argument. % % The sub_net/4 predicate consists of two assertions, the first % of which is a fact stating that the subnetwork Net has been % traversed to its final state, and that no further data symbols % are consumed. The final state of the subnetwork is labelled % "9". The sequence of data symbols passed as its third % argument are returned to the calling predicate as the fourth % argument of sub_net/4. % % The second predicate of sub_net/4 is a recursive rule the body % of which consists of three goals, the last of which is a test % of sub_net/4 itself. The first goal is a reference is the arc/4 % predicate to which is passed the name of the subnetwork and the % label of a node, and from which is returned the label of the % next node in the subnetwork and the label of the arc joining the % two nodes. The arc label is a symbol in the nonterminal % vocabulary of the context-free grammar to which the RTN % corresponds. The push/3 predicate is then tested with the arc % label as its first argument. In effect, push/3 is employed to % effect a test that the arc label corresponds to a nonterminal % symbol in the derivation of the sequence of data symbols. The % third argument of push/3 returns the data symbols remaining % after the initial symbols of the list in its second argument % have been consumed. The remaining symbols are passed to % sub_net/4 which is called to effect traversal of the next arc % in the current subnetwork. The data symbols left following % this, and subsequent traversals are returned to be passed back % for testing with other subnetworks. % sub_net( Net, 9, Data, Data ) . sub_net( Net, Node1, Data, Data2 ) :- arc( Net, Node1, Label, Node2 ), push( Label, Data, Data1 ), sub_net( Net, Node2, Data1, Data2) . %push/3 % % push( Label, L, L1 ) : L1 is the list of data symbols left over from the list % L following a successful push (acceptance) of the % symbol Label. % % The push/3 predicate employs two assertions to perform two % separate operations which are known, in the language of % Transition Network Grammars, as "pushing" and "catting". % % The "catting" operation, which is effected by the first of % the two assertions, consists of testing that a data symbol is % in the lexical category expected as this stage of the acceptance % procedure. The term "expected" is appropriate in this context % because an RTN works top-down in an hypothesis-driven fashion. % The test is performed by means of a reference to cat/3 % in the body of the first assertion. The test succeeds if the % current data symbol Word (the symbol now at the head of the % list following consumption of any previous symbols) is in the % category passed as the first argument of push/3. Thus, the % assertions of the cat/3 predicate correspond to entries in the % RTN's lexicon. Note that each of these entries may be viewed % as a Finite State Automaton with a transition network consisting % of only one arc. In each case, the arc is labelled with the % corresponding data symbol (word) and the initial state or node % is labelled with the (lexical) category. Label of the final % state (node) of the arc is arbitrary and, in fact, unnecessary. % % The second assertion of push/3 performs the RTN "push" % operation, consisting of testing that there exists a subnetwork % corresponding to the arc label (nonterminal symbol) passed as % the first argument of push/3. (Recall that push/3 is called % during traversal of an arc of another subnetwork. In fact, % traversal of this arc is dependent upon successful traversal % of the subnetwork called in the body of the second assertion of % the push/3 predicate.) The term "push" derives from the % equivalence of an RTN to a Push-Down Automaton. The nonterminal % symbol labelling the arc of the subnetwork being traversed when % push/3 is called is considered to be pushed onto a stack. The % real, physical stack involved is actually that employed to save % current values during recursive calls. In the language of RTNs, % if the arc label is, for example, the symbol NP, then the call % to push/3 is described as "pushing for an NP". (Recall that % the act of storing a symbol in a stack or push-down store is % is described as "pushing" the symbol onto the top of the store.) % The "pushed" symbol is "popped", that is, recovered from the % store on return to the calling predicate. Thus, if the "push % for an NP" succeeds, then the NP is considered to be "popped" % on return to the calling predicate. % push( Cat, [ Word | Data1], Data1 ) :- cat( Cat, Word, 0 ) . push( Net, Data, Data1 ) :- sub_net( Net, Net, Data, Data1 ) . %arc/4 % % arc( Net, Node1, Label, Node2 ) : Label is the label of an arc from Node1 to % Node2 in the subnetwork Net. % % Each assertion of arc/4 represents one arc of a subnetwork. % The particular subnetwork to which an arc belongs is determined % by the value in the first argument position of the assertion. % For example, all assertions with the string "s" in the first % argument correspond to arcs in the "s"-subnetwork. % % Each path (sequence of arcs) from the initial node to the final % node of a subnetwork corresponds to a rewrite rule in a context- % free phrase structure grammar. Each subnetwork corresponds to % the collection of those rules in the grammar with the same left- % hand side symbol. There is one subnetwork for each different % left-hand side symbol in the grammar. For example, the "s"- % subnetwork represents the collection of those rules with the % left-hand side symbol S, and the "vp"- and "np"-subnetworks % represent the collections of rules with VP and NP left-hand % sides, respectively. % % The node (state) labels of subnetworks are arbitrary. In this % example, the initial or starting state is labelled with the % left-hand side symbol of the rewrite rules to which the % subnetwork corresponds. Thus, the initial state label and the % subnetwork identifier or name are the same. Numbers are used % to identify all other nodes, and "9" is used to identify the % final node (state) of all subnetworks. % % The arcs of subnetworks are labelled with the nonterminals % comprising the right-hand side symbols of the rewrite rules to % which the subnetwork corresponds. The sequence of labels of % the arcs comprising each distinct path through a subnetwork % correspond to the string of right-hand side symbols in one of % the rewrite rules. For example, the path consisting of the % two arcs and <2, vp, 9> of the "s"-subnetwork % correspond to the rewrite rule S --> NP VP . These arcs are % encoded in the assertions of the arc/4 predicate as % arc(s, s, np, 2) and arc(s, 2, vp, 9), respectively. % Paths in a subnetwork can be written as ordered n-tuples of % arcs. Hence, the other two paths in the "s"-subnetwork may be % written as <, <2, vp, 9>>, and % <, <1, np, 2>, <2, vp, 9>>. These correspond % to the rewrite rules S --> WH VP and S --> AUX NP VP, % respectively. The "s"-subnetwork illustrated here can be % depicted as follows: % % NP % |---------------------->| % | | % | AUX NP | VP % |S|-------->|1|-------->|2|-------->|9| % | | % | WH | % |---------------------->| % % arc( s, s, np, 2 ) . arc( s, s, wh, 2 ) . arc( s, s, aux, 1 ) . arc( s, 1, np, 2 ) . arc( s, 2, vp, 9 ) . % % N % |------------------------>| % | | % | DET N | % |NP|--------->|1|--------->|9| % | | % | A | % |----------->| arc( np, np, a, 1 ) . arc( np, np, det, 1 ) . arc( np, 1, n , 9 ) . arc( np, np, n, 9 ) . % % V % |------------------------>| % | | % | V NP | % |VP|--------->|1|--------->|9| % | | % | AUX VP | % |---------->|2|---------->| arc( vp, vp, v, 1 ) . arc( vp, vp, v, 9 ) . arc( vp, vp, aux, 2 ) . arc( vp, 1, np, 9 ) . arc( vp, 2, vp, 9 ) . %cat/3 % % cat( Cat, Word, End ) : Word is a data symbol in the (lexical) category Cat. % cat( det, the, 0) . cat( det, some, 0) . cat( a, nice, 0) . cat( n, cat, 0) . cat( n, cats, 0) . cat( n, dog, 0) . cat( n, dogs, 0) . cat( v, like, 0) . cat( v, likes, 0) . cat( v, liked, 0) . cat( v, chase, 0) . cat( v, chases,0) . cat( v, chased,0) . cat( aux, do, 0) . cat( aux, does, 0) . cat( aux, did, 0) . cat( wh, who, 0) . cat( wh, what, 0) .