% Linguistics 484 -- Example: DCG_P.TXT % % DCG_P.TXT: A Definite Clause Grammar, implemented in a fashion % such that the Prolog interpreter functions as a parser. % % This example represents a modification of the previous example, an % implementation of a Definite Clause Grammar (DCG) which results in % the Prolog interpreter behaving as a recogniser. In this % implementation, however, the interpreter functions as a parser, a % parser being a recogniser which produces structures of the sentences % it accepts. As is the case with most parsers implemented in Prolog, % structures are represented as lists. % % While the recognition proceeds in the top-down direction (as is % "natural" for Prolog implementations of DCGs), the representation of % the structure of a sentence is built in the bottom-up direction, % beginning with the "structure" of the lexical item-category pairs. % % In this implementation of a DCG parser, lexical item-category pairs % are represented as a list of two elements, [C, [X]], wherein C denotes % the label of the category of the lexical item X, a token (word). % For example, if X were instantiated with the value "some," in the % lexical category with label DET, then the list would appear as % [det, [some]], wherein the category label DET is transcribed as a % Prolog string, which must begin with a lower-case alphabetical % character. For a parse of the sentence "some dogs like cats," the % remaining three lexical category-item pairs would be represented by % lists [n, [dogs]], [v, [like]], and [n, [cats]]. % % These list transcriptions of subtrees (just leaves and petioles, % really) are built in those assertions of the of the DCG which match % data tokens (from the input sequence) with items in the lexicon (of % the parser). The assertions which test this matching (of data token % with lexical item) are goals in the bodies of assertions of predicates % corresponding to phrasal categories, and when a match succeeds, the % goal returns a subtree transcribed in the form of a list such as those % illustrated above. For example, the assertions which test the match % for "some" and "dogs" are goals in an assertion of a predicate % corresponding to a noun phrase, and when these goals succeed, they % return the lists [det, [some]] and [n, [dogs]] to the noun phrase % assertion. This assertion then builds the following list, headed with % the category label: [np, [det, [some]], [n, [dogs]]]. % % The noun phrase assertion is itself a goal in the body of an assertion % of a predicate corresponding to another phrasal category and, when it % succeeds, it returns a list such as [np, [det, [some]], [n, [dogs]]] to % this assertion. For the sequence of data tokens "some dogs like cats" % another assertion of the noun phrase predicate will return the list % [np, [n, [cats]]]. In this case, the noun phrase predicate is a goal in % the body of an assertion of a verb phrase predicate. The verb phrase % assertion will include a goal which tests for a lexical verbal % category that will return [v, [like]]. This list will be combined with % [np, [n, [cats]]] to form [vp, [v, [like]], [np, [n, [cats]]]], a % with the verb phrase category label, which is returned to the % assertion containing the verb phrase predicate in its body. % % The predicate which includes an assertion with both noun phrase and % verb phrase goals in its body likely corresponds to the sentence % category. In this example, the assertion of the sentence predicate % will combine the lists [np, [det, [some]], [n, [dogs]]] and % [vp, [v, [like]], [np, [n, [cats]]]] to form the following list, which % headed with the sentence category label: % [s, [np, [det, [some]], [n, [dogs]]], [vp, [v, [like]], [np, [n, [cats]]]]] % % In connection with the foregoing description, note that in most % presentations, the list representation of a lexical category-item pair % is written as [C, X], rather than as [C, [X]], so that a list such as % [det, [some]] will be written as [det, some]. The (some would say) % unconventional notation used here is employed for two reasons. % % In the first place, transcribing the lexical item as a singleton list % encodes the immediate dominance, mother-daughter relationship as it is % normally portrayed in the list representation of a tree structure: % the mother category label is written as a string at the head of the % list, while daughter category labels (or daughter subtrees) are % written as lists, as has been illustrated above. Extension of this % protocol to the lexical category-item pair is thus natural and % consistent. % % In the second place, writing the lexical item as a singleton list % facilitates processing of the list to display the structure as an % "indented vertical list." Predicates which yield an indented vertical % list display, given a Prolog list data structure, will be included in % subsequent examples of parsers. % %s/1 The purpose of this predicate is to test that the s/3 % predicate returns the null list as the value of its second % argument, and to invoke the write/1 system predicate to % display the structure (in the form of a list) of a % sequence of tokens accepted as a sentence. % % s(L) : L constitutes a sentence if all the tokens of L % are accepted by s/3, leaving the null list. % Parse, a list representing the structure of the % sentence, is displayed. % s(L) :- s(L, [], Parse), write(Parse). % %s/3 s --> np vp % % s(L, T, P) : The elements of the list L comprise a sentence, % with the structure P, if the initial elements % comprise a noun phrase, and the remaining % elements comprise a verb phrase, leaving the % null list at T. % % Lists representing the structures of the noun phrase and % the verb phrase are returned by the np/3 and vp/3 goals as % the values of the variables NP and VP, respectively. These % lists are included as elements in the list appearing in the % third argument of the s/3 goal. This list, the head of % which is the string s, is returned by the s/3 predicate to % to the assertion of the s/1 predicate. % % s(L, T, [s, NP, VP] ) :- np(L, R, NP), vp(R, T, VP). % %np/3 np --> det n | n | pro % % np(L, T, P) : The initial elements of the list L comprise a % noun phrase if -- % - The first element is a determiner and the % second is a noun; or, % - The initial element is a noun; or, % - The initial element is a pronoun. % T is the tail of L, following removal of those % elements comprising a noun phrase, and P is the % structure, represented as a list, of the noun % phrase, a subtree rooted on the NP label. % % Structure of the noun phrase, represented as a list, is % constructed by including terms containing the structures % (subtrees) corresponding to the lexical categories DET and % N, or N, or PRO. % % np([X, Y | R], R, [np, DET, N] ) :- det(X, DET), n(Y, N). np([X | R], R, [np, N] ) :- n(X, N). np([X | R], R, [np, PRO] ) :- pro(X, PRO). % %vp/3 vp --> v | v np % % vp(L, T, P) : The initial elements of the list L comprise a % verb phrase if -- % - The initial element is an intransitive verb; or % - The initial element is a transitive verb, and % the following elements comprise a noun phrase. % T is the tail of L, following removal of those % elements comprising a verb phrase, and P is a % list representing the structure of the verb % phrase, a subtree rooted on the label VP. % % Structure of the verb phrase is constructed, and returned % in the third argument of the vp/3 predicate, by including % terms representing the structures of the constituents of % of the verb phrase. % vp([X | T], T, [vp, V] ) :- v(X, in, V). vp([X | R], T, [vp, V, NP] ) :- v(X, tr, V), np(R, T, NP). % %n/2 n(X, P) : X is a noun and P is the structure of the subtree % rooted on the label N. % n(X, [n, [X]] ) :- lex(X, n). % %pro/2 pro(X, P) : X is a pronoun, and P is the structure of the % subtree rooted on the label PRO. % pro(X, [pro, [X]] ) :- lex(X, pro). % %det/2 det(X, P) : X is a determiner, and P is the structure of % the subtree rooted on the label DET. % det(X, [det, [X]] ) :- lex(X, det). %v/3 v(X, in, P) : X is an intransitive verb; % v(X, tr, P) : X is a transitive verb; and, % P is the structure of the subtree rooted on % the label V. % v(X, in, [v, [X]] ) :- lex(X, v, in). v(X, tr, [v, [X]] ) :- lex(X, v, tr). %lex/2 % % The assertions comprising the lex/2 predicate in this example are as % follows: % % lex( X, C ) : Lexical item X is in category C. % lex(cat, n). lex(cats, n). lex(dog, n). lex(dogs, n). lex(play, n). lex(plays, n). lex(ball, n). lex(balls, n). lex(she, pro). lex(her, pro). lex(he, pro). lex(him, pro). lex(they, pro). lex(them, pro). lex(it, pro). lex(some, pro). lex(many, pro). lex(most, pro). lex(a, det). lex(an, det). lex(the, det). lex(some, det). lex(many, det). lex(most, det). %lex/3 % % The assertions comprising the lex/3 predicate in this example are as % follows: % % lex( X, C, SC) : Lexical item X is in category C % with subcategorisation SC. % lex(like, v, tr). lex(likes, v, tr). lex(play, v, tr). lex(plays, v, tr). lex(play, v, in). lex(plays, v, in). lex(run, v, in). lex(runs, v, in).