% Linguistics 484 -- Example: BU_DFR.TXT % % Production Systems: A Bottom-Up, Depth-First Recogniser. % % This program illustrates a PRODUCTION SYSTEM that works DEPTH-FIRST to % recognise or accept a sentence, rather than BREADTH-FIRST as does the % Production System Interpreter demonstrated in the previous example. % % Both examples illustrate RECOGNISERS, that is, procedures which may also % may be described as ACCEPTORS, or as DECISION PROCEDURES, because they % accept a string only if it is a sentence in a specified language, with the % language being specified by a PRODUCTION SET, a collection of rewrite rules. % They print a message indicating whether or not the string is a sentence. % Note that this program is not a PARSER. A PARSER is a RECOGNISER % which, in addition to determining whether or not a string is a % sentence, produces a structure for the string if it is a sentence. % % A PRODUCTION SYSTEM normally is described as being comprised of % the following three components: % % 1) a DATABASE; % 2) a PRODUCTION SET; and, % 3) an INTERPRETER. % % In these examples, the DATABASE, in its INITIAL STATE, contains a % string (sentence?); the PRODUCTION SET is a collection of context- % free rewrite rules defining the grammar of a language; and, the % program itself is the INTERPRETER. The program applies elements % (rules) from the PRODUCTION SET to the string in the DATABASE. The % initial string in the DATABASE (the INITIAL STATE of the DATABASE) % is recognised as a sentence, in the language defined by grammar % (PRODUCTION SET) if, after no further PRODUCTIONS (rules) can be % applied, and the INTERPRETER HALTS, the DATABASE contains only the % root symbol (axiom) of the grammar. The DATABASE is said to be in its % ACCEPTING STATE. % % In the case of the BREADTH-FIRST ACCEPTOR illustrated previously, the % INTERPRETER applies applicable productions to the DATABASE string so that, % given a string such as "the cat ate the rat", it derives the following % sequence of sentential forms: % % det cat ate the rat % det n ate the rat % det n v the rat % det n v det rat % det n v det n % np v det n % np v np % np vp % s % % The DEPTH-FIRST ACCEPTOR illustrated in this example derives the following % sequence of sentential forms: % % det cat ate the rat % det n ate the rat % np ate the rat % np v the rat % np v det rat % np v det n % np v np % np vp % s % % (Note that the BREADTH-FIRST RECOGNISER displays only a subset of the % sentential forms in the derivation of "s" from "the cat ate the rat".) % % The present DEPTH-FIRST RECOGNISER is obtained from the previous BREADTH- % FIRST RECOGNISER by making minor modifications to one of the assertions % of the test_prod/2 predicate. These changes are noted in comments % accompanying the predicate. There are no other changes to any of the other % predicates comprising the INTERPRETER. % %s/1 % % s( L ) : The list of symbols L comprises a sentence (licensed by the % productions represented in the assertions of the prod/2 % predicate). % % The initial contents of the Production System's DATABASE is % the string of symbols in the argument of the s/1 predicate. % s( Data ) :- test_data( Data, Steps ), show_steps( Steps ) . %test_data/2 % % test_data( L, F ) : L is a sentential form (represented as a list of % symbols) provided to test_data/2 at the step in the % recognition at which test_data/2 is called. F is the % sequence of sentential forms (represented as a list of % lists) derived by the successive application of % productions and returned by test_data/2. % % test_data( [s], [] ) . test_data( Data, [Result | Steps] ) :- test_prod( Data, Result ), Data \= Result, test_data( Result, Steps ) . %test_prod/2 % % test_prod( T, F ) : F is a sentential form derived from the sentential % form T. % % In both the previous BREADTH-FIRST RECOGNISER and in the % present DEPTH-FIRST RECOGNISER, the second assertion of the % test_prod/2 predicate tests a symbol in the DATABASE against % the first right-hand side symbols of the productions in the % PRODUCTION SET. If the symbols match, the apply_prod/3 % predicate is invoked to test the remaining right-hand side % symbols of the candidate production. If these symbols match % successive sentential form symbols, then the production is, % in effect, applied. If no match is obtained with the % sentential form symbols tested, then the third assertion of % test_prod/2 is employed to move one symbol to the right in % the DATABASE. If the end of the sentential form is encountered, % the first assertion of test_prod/2 succeeds. % % In the case of the BREADTH-FIRST RECOGNISER, if a production % can be applied, then the test_prod/2 is invoked recursively % to test the next symbols of the sentential form in the DATABASE % following those replaced after successful application of a % production. These, the symbols remaining in the DATABASE to the % right of those replaced by application of a production, are % returned in the third argument of apply_prod/3 and passed in % the first argument of test_prod/2 when it is called. Note % that these two goals, the call to apply_prod/3 and the % recursive call to test_prod/2 have been "commented-out" in % this example of a DEPTH-FIRST RECOGNISER. % % In this case, an example of a DEPTH-FIRST RECOGNISER, the % call to apply_prod/3, followed by a call to test_prod/2, has % been replaced by a call to apply_prod/3 which returns those % symbols of the sentential form to the right of those replaced % by the application of a single production. The test_prod/2 % predicate is not called to apply further productions to these % symbols. Instead, the symbols are returned, prefixed with % the left-hand side of the production that was applied, to the % predicate that called test_prod/2. If the symbols replaced % are the first in the sentential form, then test_prod/2 was % called by test_data/2 and the return is made to test_data/2. % If the symbols replaced by application of a production are % not the first, because no production was applicable to the % initial symbols, then test_prod/2 was called recursively in % the third assertion of test_prod/2. Returning through % test_prod/2 in its third assertion causes the initial symbols % of the sentential which could not be replaced to be prefixed % to those to their right, which include the left-hand side % symbol of the production that was applied in the second % assertion. Ultimately, the return is made to test_data/2. % % The purpose of the test_data/2 predicate, in addition to % testing if the DATABASE has been reduced to the root symbol, % is to, in effect, "reset" the DATABASE so that, when % test_prod/2 is called, the search for symbols that can be % replaced by application of a production begins again with the % first symbol. In the case of the BREADTH-FIRST RECOGNISER, % the "resetting" effected by test_data/2 is performed only % after all symbols that can be replaced have been replaced in % current form of the DATABASE. On the other hand, in the case % of the DEPTH-FIRST RECOGNISER, the "resetting" is effected % after only one production has been applied. % test_prod( [], [] ) . test_prod( [Symbol | Data_i], [LHS | Result] ) :- prod( LHS, [Symbol | Symbols] ), apply_prod( Symbols, Data_i, Result ) . % depth-first % apply_prod( Symbols, Data_i, Data_j ), % breadth-first % test_prod( Data_j, Result ) . % breadth-first test_prod( [Symbol | Data_i], [Symbol | Result] ) :- test_prod( Data_i, Result ) . %apply_prod/3 % % apply_prod( P, D, R ) : R is the tail of a sentential form (represented % as a list of symbols) obtained from the tail of the % sentential form at D by removal of those leading % elements (symbols) that match the right-hand side % symbols P of the production being applied. % apply_prod( [], Data_i, Data_i ) . apply_prod( [Symbol | Symbols], [Symbol | Rest], Data_i ) :- apply_prod( Symbols, Rest, Data_i ) . %show_steps/1 % % show_steps( L ) : L is a list of sentential forms, each of which is a % list of symbols derived during a recognition. % show_steps( [] ) :- write('----------'), nl . show_steps( [Step | Steps] ) :- write( Step ), nl, show_steps( Steps ) . %prod/2 % % prod( L, R ) : L is the left-hand side symbol of a production and R is a % list of the right-hand side symbols. % prod( s, [np, vp] ) . prod( np, [det, n] ) . prod( vp, [v, np] ) . prod( vp, [v, np, pp] ) . prod( n, [cat] ) . prod( n, [rat] ) . prod( v, [ate] ) . prod( det, [the] ) . prod( p, [in] ) . prod( n, [road] ) . prod( pp, [p, np] ) . prod( np, [np, pp] ) .