|
The majority of computationally-oriented psychological models, albeit certainly not all such models, are based on phrase-structure grammars. Furthermore, people are generally more familiar with phrase-structure grammars than with other formalisms. Therefore, the emphasis of this course will be on varieties of phrase-structure grammar and the processing models based on these frameworks, although processing models based upon other approaches to natural language syntax such as the categorial grammars will also be discussed.
The processing models and examples presented during this course consist of Prolog programs. Therefore, we begin with a review of Prolog language concepts such as lists, recursion, and unification introduced in Linguistics 482 that are of particular relevance to the topics dealt with in Linguistics 484.
Note, however, that you will not be required to write Prolog programs in Linguistics 484. Assignments involving Prolog will require only that you make minor modifications to existing programs, and no extensive knowledge of Prolog is needed to make these modifications. Our review of Prolog is directed more toward enabling you to follow the examples and thereby appreciate the processing concepts and strategies being illustrated.
Many of the concepts presented in Linguistics 484 are described in terms of the language and notation of set theory. We therefore review those aspects of set theory and the attendant notation required to follow the presentation and achieve understanding of these concepts.
Among the concepts discussed during this course is the notion of a procedure. In the context of natural language processing, procedures usually consist of detailed, step by step descriptions of the operations entailed in applying a grammar to recognise strings of words as sentences. Such procedures can form the basis of cognitive models of human language processing. It is also generally accepted that sufficiently well-defined procedures can be represented in terms of the operations performed by abstract computers called automata.
Particular varieties of automata, such as the finite-state and the push-down machines, can be demonstrated to be equivalent to natural language grammars, such as the finite or regular grammars and the context-free phrase-structure grammars. Equivalence in this context means that, for example, a push-down automaton can be devised that accepts just the sentences generated by a given context-free grammar. We also examine the relationship of the context-free grammars to the regular, the context-sensitive, and the unrestricted phrase-structure grammars in the Chomsky hierarchy, and we discuss the machines to which these grammars are equivalent.
The study of these machines and their operations facilitates assessment of the computational resources required to process natural language sentences. These assessments shed light upon the resources and processes that might be employed by the human language processing system. For example, limits on the resources normally available to the human processor can explain why some sentences are more difficult to analyse than others, and the imposition of constraints on the resources available to these machines can be used to model human language processing deficits.
We examine a variety of different parsing strategies that have been
applied in psychological models of the sentence analysis process
such as the
top-down, structurally-driven
procedures and the
bottom-up, data-driven procedures.
We also discuss particular parsing procedures such as the
chart parsers
and procedures based on
transition network grammars.
The chart parsing procedure forms the basis of most parsing programs
used today in natural language processing applications; but it also
can be applied to model aspects of parallel processes in sentence
analysis.
The transition network procedures are important chiefly for historical
reasons, although they have been applied in some significant
psychological models of sentence processing.