Linguistics 482 - Computational Linguistics Prolog Notes A. C. Brett
Department of Linguistics
University of Victoria
Clearihue C139
Last updated: 20 October 2005


Variables are morpheme-like entities in Prolog that function as place holders for information that is unknown at some stage in the processing of a statement, but which will normally become available at a later stage of the process.

Variables are represented by sequences of upper and lower case letters, digits, and underscore characters, and must begin with either an upper case letter or an underscore. Names such as X, Sum, Result, Num, and Agr, all of which appear in these notes, are therefore variables, whereas names such as this, cat, and sg, which begin with a lower case letter, are atoms.

The place holding function of a variable can be illustrated with the following example involving an arithmetic expression:
   X is 3 + 4 .
wherein the result of the addition operation is unknown until the arithmetic expression is evaluated. In this case, the variable X is employed to identify a location at which the result will be placed when the addition is performed. Prior to evaluation of the expression, X has no value associated with it; a variable in these circumstances is said to be uninstantiated. When the expression is evaluated, the result of the addition is instantiated on the variable X. This operation may be construed as storing the result at the location identified by the variable name. Note that there is nothing special about use of the name X in this context, outside of its conventional use in the injunction: "Let X be the unknown." The variables Sum or Result, or any other variable, would serve as well.
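The change from uninstantiated to instantiated can be observed directly with the standard type-checking predicates var/1 and integer/1. The following is a minimal sketch; the predicate name demo_is/1 is a hypothetical helper introduced only for this illustration.

```prolog
% demo_is/1 is a hypothetical helper, not part of the course program.
demo_is(Sum) :-
    var(Sum),        % succeeds: nothing is instantiated on Sum yet
    Sum is 3 + 4,    % the expression is evaluated and 7 is instantiated on Sum
    integer(Sum).    % succeeds: Sum now stands for an integer
```

Typing the goal demo_is(S). should report that 7 has been instantiated on S.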

In the foregoing example, an integer is instantiated on the variable. In fact, any term may be instantiated on a variable. For example, if one were to type

   X = 3 + 4 .
then the expression 3 + 4 itself would be instantiated on X. The equal sign in this context is a special atom that names the unification operator. Since X is initially uninstantiated, this predicate effects the instantiation of the term 3 + 4 on it. This term is in fact a structure with the name + and two arguments, in this case, 3 and 4. This structure is not evaluated unless it is handed to a predicate that performs arithmetic evaluation, most commonly as the right-hand or second argument of the is predicate.
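That unification stores the structure rather than its value can be checked with the standard predicates \==/2 and functor/3. This is a sketch only; demo_unify/2 is a hypothetical name chosen for the illustration.

```prolog
% demo_unify/2 is a hypothetical helper, not part of the course program.
demo_unify(Name, Arity) :-
    X = 3 + 4,                 % unification only: X stands for the structure +(3, 4)
    X \== 7,                   % succeeds: X is not the number 7
    functor(X, Name, Arity).   % recovers the structure's name and number of arguments
```

Typing demo_unify(N, A). should report the atom + for N and 2 for A.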

The storage of a term at a location identified by a variable, and the subsequent recovery of the term from that location, is demonstrated by the following example:

   Duh = 3 + 4, Oh is Duh.
consisting of two goals. In the first, the expression 3 + 4 is instantiated on the variable Duh. In the second, which is separated from the first by a comma, the expression 3 + 4 is recovered from Duh. The expression is evaluated, and the resulting sum is instantiated on the variable Oh.

A slightly more pertinent illustration of the foregoing processes may be demonstrated in an example wherein a variable is employed to enforce determiner and noun number agreement in a noun phrase. This example is based upon a Prolog program that includes the following clauses:

   det(this,  sg).
   det(these, pl).
   det(the,   _ ).
   det(a,     sg).

   noun(cat,  sg).
   noun(cats, pl).

   np(X, Y, Num) :- det(X, Num), noun(Y, Num).
The variables X and Y in the arguments of the np(X, Y, Num) structure in the head of the np/3 rule serve as place holders for the two tokens that can constitute a noun phrase. The variable Num serves as a place holder for a number feature value. The variables in the arguments of the det(X, Num) and noun(Y, Num) structures in the tail of the rule serve the same function, with X standing in place of a determiner, and Y in place of a noun, with Num standing in place of a number feature value in both structures.

If the following goal is typed (after the Listener has been started and the program containing the foregoing clauses has been consulted):

   np(these, cats, Agr).
the Listener will unify this goal with the np(X, Y, Num) structure. In the process of this unification, the atoms these and cats will be instantiated on the variables X and Y, respectively. The two variables Agr and Num will unify and become synonyms for the same variable.

When the det(X, Num) goal is proved, the atom these is, in effect, recovered from the location named by the variable X, and the Listener unifies the resulting goal, det(these, Num), with the structure det(these, pl) among the facts in the program. As a result of this unification, the atom pl gets instantiated on the variable Num. Then, when the noun(Y, Num) goal is proved, the atoms cats and pl are, in effect, recovered from the locations named by the Y and Num variables, respectively. The goal becomes noun(cats, pl), which can be unified with one of the facts in the program. On return to the goal originally typed, namely, np(these, cats, Agr), since Num and Agr have been unified, the atom pl is displayed as having been instantiated on the variable Agr.
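The sequence of steps just described can be written out as explicit goals. The sketch below repeats the clauses of the program so that it can be consulted on its own; replay/1 is a hypothetical helper that imitates, one goal at a time, what the Listener does when proving np(these, cats, Agr).

```prolog
det(this,  sg).
det(these, pl).
det(the,   _ ).
det(a,     sg).

noun(cat,  sg).
noun(cats, pl).

np(X, Y, Num) :- det(X, Num), noun(Y, Num).

% replay/1 is a hypothetical helper, not part of the course program.
replay(Agr) :-
    X = these, Y = cats,   % unification with the head instantiates X and Y
    Agr = Num,             % Agr and Num become synonyms for one variable
    det(X, Num),           % proved as det(these, Num); pl is instantiated on Num
    Num == pl,             % check that Num now stands for pl
    noun(Y, Num).          % proved as noun(cats, pl)
```

Typing either np(these, cats, Agr). or replay(Agr). should report pl for Agr.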

The fact that the variable Num does indeed enforce number agreement between the determiner and the noun can be demonstrated by considering what would happen if the following goal were typed:

   np(these, cat, Agr).
The atom pl will be instantiated on Num, as described above, when the det(these, Num) goal is proved. Consequently, the Listener will attempt to prove the goal noun(cat, pl), but will fail because this goal cannot be unified with a fact in the program.

Both of the following goals:

   np(the, cat, Agr).
   np(the, cats, Agr).
will succeed, however, because of the anonymous variable, represented by the underscore character, _, in the second argument of the det(the, _) fact. The Listener generates a unique internal variable whenever it encounters an instance of the anonymous variable. In this example, when the det(X, Num) goal is proved, Num unifies with one of these generated internal variables; but, since neither it nor Num has been instantiated, no value is instantiated on Num. The variable Num then gets instantiated only when the noun(Y, Num) goal is proved, with the result that no number agreement checking takes place. The atom instantiated on Num, either sg or pl, depending on whether you typed cat or cats in the goal to be proved, will still be reported by the Listener as having been instantiated on the variable Agr.
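One way to see the combined effect of the agreement variable and the anonymous variable is to collect every noun phrase the program accepts. The sketch below repeats the clauses so that it can be consulted on its own, and uses the standard findall/3 predicate; all_nps/1 is a hypothetical helper introduced for this illustration.

```prolog
det(this,  sg).
det(these, pl).
det(the,   _ ).
det(a,     sg).

noun(cat,  sg).
noun(cats, pl).

np(X, Y, Num) :- det(X, Num), noun(Y, Num).

% all_nps/1 is a hypothetical helper, not part of the course program.
% It collects each accepted determiner, noun, and number value.
all_nps(Phrases) :-
    findall(Det-Noun-Num, np(Det, Noun, Num), Phrases).
```

Typing all_nps(L). should report the list [this-cat-sg, these-cats-pl, the-cat-sg, the-cats-pl, a-cat-sg]. The determiner the appears with both nouns, while combinations such as these with cat are absent.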

The anonymous variable noted above serves strictly as a place keeper. It can be used in situations such as that illustrated here, or in any similar circumstance wherein whatever value might be instantiated on it is of no interest.
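The following sketch illustrates both points: the use of the anonymous variable where a value is of no interest, and the fact that each occurrence of the underscore is a distinct variable. The predicates is_det/1, any_pair/1, and same_pair/1 are hypothetical names introduced only for this illustration.

```prolog
det(this,  sg).
det(these, pl).
det(the,   _ ).
det(a,     sg).

% is_det/1 (hypothetical): succeeds for any determiner; its number is ignored.
is_det(Word) :- det(Word, _).

% any_pair/1 (hypothetical): the two underscores are different variables,
% so the two arguments of pair need not be the same.
any_pair(pair(_, _)).

% same_pair/1 (hypothetical): the named variable X occurs twice,
% so both arguments must unify with each other.
same_pair(pair(X, X)).
```

The goals is_det(these). and any_pair(pair(1, 2)). should succeed, whereas same_pair(pair(1, 2)). should fail.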

The Listener actually generates its own internal identifiers for all the variables in a program. These identifiers may be displayed, either as H with a suffixed number or as an underscore with a suffixed number, when a variable cited in a typed goal does not get instantiated when the goal is proved.
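A variable that remains uninstantiated after a goal is proved can be detected with var/1, and its internal identifier can be written out with write/1. This is a sketch under the assumption of a standard Prolog system; unbound_after_proof/1 and show_identifier/0 are hypothetical helpers, and the exact form of the identifier printed depends on the system used.

```prolog
det(the, _).

% unbound_after_proof/1 (hypothetical): proving det(the, Num) leaves Num
% uninstantiated, which var/1 confirms.
unbound_after_proof(Num) :- det(the, Num), var(Num).

% show_identifier/0 (hypothetical): writes the system-generated identifier
% of the uninstantiated variable; the text printed varies between systems.
show_identifier :- det(the, Num), write(Num), nl.
```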

The Listener, of course, uses the same internal identifier for two or more references to the same variable within one clause; however, it generates different identifiers for variables which might have the same name but are in different clauses. Thus, although you might use the same variable names in two or more different clauses, the Listener treats the names in the different clauses as different variables. Consequently, the instantiation of a variable in one clause does not cause the instantiation of a variable for which you have used the same name in another clause.
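The independence of like-named variables in different clauses can be sketched as follows. The predicates set_x/0, check_x/0, and both/0 are hypothetical names introduced only for this illustration.

```prolog
% set_x/0 (hypothetical): instantiates 1 on the X belonging to this clause.
set_x :- X = 1, X == 1.

% check_x/0 (hypothetical): the X in this clause is a different variable, so
% it is still uninstantiated on entry and can have 2 instantiated on it.
check_x :- var(X), X = 2, X == 2.

% both/0 (hypothetical): proves the two goals in sequence.
both :- set_x, check_x.
```

The goal both. should succeed. If the two clauses shared a single X, check_x would fail, because 1 would already be instantiated on X.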
