Thursday, November 17, 2011

Inflection, Derivation and Compounding


Inflection, derivation and compounding: preliminaries


In this section I briefly introduce certain important notions which will figure widely later: inflection, in which we create word forms of lexemes (such as the plural or past tense), derivation, in which we create new lexemes from old lexemes, and the compound word, a single word formed by combining two other words. We begin with compounds.

The most straightforward type of compound simply consists of two words concatenated together: morphology + article = morphology article; house + boat = houseboat. The right-hand member is the head of the compound, determining the syntactic category and meaning of the whole (a morphology article is a kind of article, a houseboat is a kind of boat, as compared with a boathouse, which is a kind of house). The left-hand member is the modifier. In transparent cases such as morphology article the meaning of the whole is derived from the meanings of the components (though the precise meaning is indeterminate and depends on the context of use).
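The right-hand head rule just described can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `Word` and `Compound` are invented here, forms are plain spellings, and "meaning" is reduced to the "kind of" relation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str       # orthographic shape
    category: str   # syntactic category, e.g. "N"


@dataclass(frozen=True)
class Compound:
    modifier: Word          # left-hand member
    head: Word              # right-hand member
    spaced: bool = False    # written as two orthographic words?

    @property
    def form(self) -> str:
        sep = " " if self.spaced else ""
        return self.modifier.form + sep + self.head.form

    @property
    def category(self) -> str:
        # The head determines the category of the whole.
        return self.head.category

    def kind_of(self) -> str:
        # A transparent compound denotes a kind of whatever its head denotes.
        return self.head.form


house, boat = Word("house", "N"), Word("boat", "N")
for c in (Compound(house, boat), Compound(boat, house)):
    print(f"{c.form}: a kind of {c.kind_of()} ({c.category})")
```

Swapping the two members changes the head, and with it what the compound is a kind of.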

There is an important distinction in many languages between compounds and phrases. In many cases the difference is obvious. In a hackneyed example such as blackbird as opposed to black bird the compound has stress on black, while the phrase is stressed on bird (in neutral contexts at least). Moreover, a black bird is necessarily black, while a blackbird is a particular species of bird whatever its colour (female blackbirds are brown, for instance). This means that the semantics of this compound is non-compositional, i.e. we can’t determine the meaning of the whole just from the meanings of the parts. The semantics of phrases (idioms apart) is compositional. The difference can be illustrated syntactically as in (2, 3) (making very conservative assumptions about syntactic structure):


This is the standard story, though there are interesting subtleties. For instance, there is no way of determining the syntactic category of the modifier in blackbird, because it is fixed as part of the compound and can’t be subjected to any of the morphological or syntactic manipulations that real adjectives can. Thus, compare (4) and (5):

Pictures 2 -3 

Moreover, black doesn’t mean “black” in blackbird (because a blackbird doesn’t actually have to be black). Thus, the modifier black has neither category nor meaning; it just has a bare morphophonological shape. Therefore, (3) should be rewritten as (6):


The point is that blackbird is a lexicalized compound whose internal structure is only of historical significance, unlike a non-lexicalized coinage such as morphology article. In time, with changes in pronunciation, even this historical structure becomes opaque. Thus, husband is derived etymologically from (modern) “house” and “bond,” but it isn’t recognized as a compound by anyone except students of Middle English. Nonetheless, noun + noun compounding is a fully productive process in English. Simplifying somewhat, we can say that a process is productive if it applies freely in principle to all the lexemes of the language of the relevant type, allowing new forms to be created at will even if they have never been used before. Such processes therefore have to be semantically regular, without any lexicalized idiosyncrasy of meaning; otherwise, hearers would have no way of knowing what a new coining was supposed to mean (see Aronoff and Anshen 1998, for more detailed discussion). The meaning of such compounds is admittedly vague: a morphology article is an article which has some connection with morphology. On the other hand, adjective + noun compounds aren’t productive and there are virtually no verb + noun compounds (there is a tiny handful of exceptions like swearword and drawbridge).

A variety of types of productive compounding are known in the languages of the world. A particularly interesting type, which has been the subject of some debate in recent years, is that known as noun incorporation (see Mithun 1984). In noun incorporation we see an alternation in which the direct object of a verb may form a compound with that verb. In (7) we see two examples from Chukchee (a member of a small language group spoken in northeast Siberia):

Pictures

In (7a) the subject pronoun is in the ergative case (the case used to mark the subject of a transitive sentence), while the object is in the absolutive case. Being transitive, the verb agrees with both the subject and the object. In (7b) the root of the object noun has formed a compound with the verb root. This renders the verb intransitive, so it agrees solely with the 1st person subject. The subject pronoun is now in the absolutive case, the case used for intransitive subjects. Finally, notice that the 1sg prefix comes to the left of the incorporated noun root and the vowels of the root have changed. This is due to vowel harmony, under which the “weak” vowel /i/ is changed to /e/ when there is a “strong” vowel elsewhere in the word (e.g. the /o/ of pojgr). Vowel harmony only operates within a word, and this helps us identify the incorporative complex as a single word form morphologically. Examples (7a, b) differ slightly in emphasis but are otherwise synonymous. Thus, it is clear that pojgr still realizes the “spear” lexeme even when it is compounded. Noun incorporation is completely productive in Chukchee, with very few restrictions.

Turning to derivation, the nouns writer, painter, walker are clearly related to the verbs write, paint, walk, meaning roughly “person who writes, paints, walks,” by suffixation of -er. I shall call these subject nominals. It is customary to treat write and writer as distinct lexemes related by derivation, rather than word forms of a single lexeme. For instance, writer is a noun, while write is a verb. The morphological operations which realize derivation (such as -er affixation) may or may not be regular and productive. Thus, apply has a subject nominal applic-ant, with irregular suffix -ant added to an irregular form of the root, applic-. I discuss derivation in more detail in section 3.1.

As a verb lexeme, write has its own set of grammatical words expressed by the forms write, writes, writing, wrote, written. Similarly, writer has its own set of forms: writer, writers. These grammatical words are the inflected forms of the lexeme and the process of constructing inflected forms is known as inflection (“inflectional morphology”). The meanings of the inflected forms are predictable (plural of noun, past tense of verb, or whatever), while the shape of inflected forms is generally determined by affixation to the stem form of the lexeme. The stem consists of the root and any derivational affixes. In morphologically complex languages a given lexeme might have several stems for different types of inflection (for example, all verbs may have separate present tense and past tense stems). Irregularity, either in the stem or the affix, is not uncommon. Thus, knife has the irregular stem form knive- in the plural (knives), while ox has the irregular suffix -en (oxen). Irregularity of form can be complete as in total suppletion, when one inflected form bears no shape relation to the rest of the paradigm (e.g. went as the past tense of go). Where there is still some overlap we talk of partial suppletion (as in brought ~ bring, where the first two consonants are identical). Even where the shapes are irregular, the past tense meaning is exactly the same as it is for any other verb, whether irregular (such as write ~ wrote, bring ~ brought, go ~ went) or regular (e.g. scribble ~ scribbled).
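As a rough illustration of stem plus affix with listed irregularities, here is a toy Python sketch. The table and function names are invented for this example, the forms are spellings rather than pronunciations, and ordinary spelling rules (bus ~ buses and the like) are deliberately ignored.

```python
# Only irregular information is listed; everything else gets the regular rule.
IRREGULAR_PLURAL_STEMS = {"knife": "knive"}   # irregular stem, regular suffix
IRREGULAR_PLURAL_SUFFIXES = {"ox": "en"}      # regular stem, irregular suffix

# Past tense forms stored whole: total suppletion (go), partial suppletion
# (bring), and other irregular shapes (write).
LISTED_PAST = {"go": "went", "bring": "brought", "write": "wrote"}


def plural(noun: str) -> str:
    """Plural = (possibly irregular) stem + (possibly irregular) suffix."""
    stem = IRREGULAR_PLURAL_STEMS.get(noun, noun)
    suffix = IRREGULAR_PLURAL_SUFFIXES.get(noun, "s")
    return stem + suffix


def past(verb: str) -> str:
    """Listed form if there is one, otherwise the regular -ed suffix."""
    if verb in LISTED_PAST:
        return LISTED_PAST[verb]
    # A stem already ending in "e" only adds "d" (scribble ~ scribbled).
    return verb + ("d" if verb.endswith("e") else "ed")


for noun in ("cat", "knife", "ox"):
    print(noun, "~", plural(noun))
for verb in ("scribble", "write", "bring", "go"):
    print(verb, "~", past(verb))
```

Whichever route produces the shape, the function returns "the past tense of the verb" in every case, which mirrors the point that irregular shapes carry exactly the regular meaning.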

Inflections express grammatical or functional categories. The inflectional system organizes the forms of words into systematic groupings or paradigms. There are essentially two sorts of function subserved by inflection. Many inflections signal an aspect of meaning which is grammaticalized, such as number (singular vs. plural) or tense. This means that the words of a given class obligatorily signal the grammatical distinction: thus, all verbs in English have to have a past tense (even if these are not actually distinct forms, as in put). Booij (1994) refers to this as inherent inflection.

One typical inherent inflection for nouns is case, in which the grammatical or semantic role of a noun in a sentence is shown by its form. In Russian a noun generally has distinct forms for the subject, direct object or indirect object:

(8) Len-a dala Ir-e knig-u
Lena-NOMINATIVE gave Ira-DATIVE book-ACCUSATIVE
“Lena gave Ira a book.”

Lena, Ire, knigu in (8) are case-inflected forms of the lexemes lena, ira, kniga.

Verbs exhibit much greater variety in their inflectional systems. Two common inherent inflections are tense and aspect. Tense refers to anchoring in time, as with English wrote (past) as opposed to writes (non-past – present or future reference). A given language may distinguish a number of different tenses (such as recent vs. remote past) or no tense at all. Aspect refers to the manner in which an event unfolds over time. A very common aspectual distinction is that between completed (perfective) and non-completed (imperfective) events. In Slavonic languages most verbs have separate perfective and imperfective paradigms, e.g. op’isat’ (perf.) ~ op’isivat’ (impf.) “describe” (see also section 3.2). Many languages have very rich aspectual markings modifying the meaning of the base verb in very subtle ways. Below is just a small selection of the fifteen aspectual affixes described for Chukchee by Skorik (1977: 179–202):

(9) -l?et prolonged continuous action:
?@tt?e ninepièku-lqet-qin . . . ott@lg@n
dog jump-ASP-3/3 stick
“The dog jumped over the stick over and over again.”

(10) -cir prolonged interrupted action:
. . . èinqejmuri n?ejèew-cir-muri jaral?a
us.children called-ASP-1plOBJ people.at.home
“The people at home kept calling us children.”

(11) -cit / -cet alternating action:
. . . natc@-cet-qenat . . .
hide-ASP-3plSUBJ
“They played at hide-and-seek”

(12) -sk@cet accelerated action:
q@nwer è@to-sqrcat-g?e g@mnin t@letumgin
at last come.out-ASP-3sgSUBJ my companion
“At last my companion sprang out”

More than one of these can be combined:

(13) m@t-ra-t@la-tenmaw@-pl@tko-èèo-g?a
1pl-FUT-GRADUALLY-prepare-FINISH-BEGIN-FUT
“we will begin to gradually finish the preparations”

Other types of verb inflection include mood (whether a statement is presented as fact, possibility, hypothetical situation and so on) such as the subjunctive mood of Romance languages, the optative expressing a wish (e.g. Ancient Greek), imperative for issuing commands, and interrogative, a special set of verb forms used for asking questions (e.g. the Eskimo languages). Many language groups signal polarity (negation) inflectionally (Bantu, Turkic, Athapaskan, and others). It is very common for a given inflectional morpheme to signal a complex mixture of tense, aspect, mood, and polarity.

Any of the above functional categories can be expressed syntactically, by word order or by function words such as the English aspectual auxiliaries (has been reading). One purely morphological type of inherent inflection is inflectional class: declensions for nouns and adjectives and conjugations for verbs. Which noun or verb goes in which class is in general arbitrary. Russian nouns can be put into four main declensions depending on the inflections they take (though different descriptive traditions distinguish different numbers of declensions):

Pictures 

(The symbol ’ represents palatalization. Consonants are always palatalized before /e/. The case names are traditional and represent a variety of syntactic functions.) I have given two subtypes of class I nouns, one animate, the other inanimate. In the inanimates the accusative case is always the same as the nominative, while in the animates the accusative takes the form of the genitive. This type of situation, in which parts of a paradigm are systematically identical, is known as syncretism. There are other syncretisms here, too. For instance, the dative, instrumental and prepositional plural endings are the same for all classes, that is, the class feature is neutralized and there is effectively a single set of endings for the whole of the class “noun.” On the other hand, the behavior of pairs such as “law” and “boy” requires us to set up a covert category of animacy for Russian, which never has any direct expression (there is no form which has a suffix identifiable as the “animacy” suffix) but which is nonetheless part of the inflectional system. Note that it is the property “animacy” which is covert, not the accusative case. We know this because class II nouns have a separate accusative, in the singular at least (see Corbett and Fraser 1993, for more detailed discussion of the implications of these data).
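The animate / inanimate contrast can be made concrete with a small Python sketch that looks for identical cells in a paradigm. The transliterated singular forms of “law” and “boy” are supplied here for illustration (only three cases are shown), and the function name `syncretisms` is invented for this example.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Partial singular paradigms of two class I nouns (illustrative transliterations).
PARADIGMS: Dict[str, Dict[str, str]] = {
    "law": {"nom": "zakon", "acc": "zakon", "gen": "zakona"},        # inanimate
    "boy": {"nom": "mal'čik", "acc": "mal'čika", "gen": "mal'čika"},  # animate
}


def syncretisms(paradigm: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return every pair of cells that are realized by the same form."""
    return [(a, b) for a, b in combinations(paradigm, 2)
            if paradigm[a] == paradigm[b]]


for gloss, cells in PARADIGMS.items():
    print(gloss, syncretisms(cells))
```

No cell is labelled “animate” anywhere in the data: animacy shows up only in which cells coincide, which is the sense in which the category is covert.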

Russian verbs inflect so as to indicate the person and number of their subject (see below on “agreement”) as well as for tense, and occur in two main conjugations (together with a plethora of minor variations on each of these classes):

Pictures

As can be seen, the endmost suffixes are common to both classes, except in the 1sg and 3pl forms. Both types have a special stem-forming suffix, -aj- and -i- respectively, and class I has in addition a “linking vowel” -o-. The -aj/-i formatives are found throughout the inflectional system of the verbs.

The other role of inflection is to realize the syntactic functions of agreement and government. This is what Booij (1994) calls contextual inflection, because it is determined by the syntactic context in which the lexeme is used. In many languages a verb must agree with its subject and / or object, by cross-referencing various of their properties. This occurs marginally in English for third person non-past verb forms: Harriet writes vs. the girls write. In Chukchee transitive verbs agree with both the subject and the object, in rather complex ways. The system for one of the six tense forms in the indicative mood is shown in (16) (see Muravyova 1998; empty cells represent non-existent forms in which the subject and object would have the same person features):

Pictures 

The verb references the person and number both of the subject and of the object, though there is no simple relationship between many of the affixes and their functions. Thus, although the prefixes tr- and mrt- clearly mean “1sg subject” and “1pl subject” respectively, the prefix na- seems to mean “3pl subject” or “3sg subject with 2nd person object or 1pl object” and the suffix -nen seems to mean “3sg object but only if the subject is 3sg.” One consequence of this is that some forms correspond to more than one subject–object pairing, e.g. napelagrt, which means either “3sg leaves 2sg (s/he leaves thee)” or “3pl leaves 2sg (they leave thee).” The system proves to be even more complex than this when the full set of tenses, moods, and voices is taken into account. Patterns such as this are typical of languages with rich agreement systems, and such data have been instrumental in changing the views of linguists about the nature of the morpheme.

Adjectives often agree with the nouns they modify. This is extremely marginal in English, only being found for this and that (this / that cat vs. these / those cats). In Russian, however, an adjective agrees with its noun in number and case:

Pictures 

It might be thought that the adjective agrees in declension, but this is wrong. All nouns in Russian have one of three genders, masculine, feminine, or neuter. Male and female humans are masculine and feminine respectively and for other nouns gender depends largely on declensional class. Members of class I are masculine, those of classes II, III are feminine and those of class IV are neuter. However, there are certain exceptions. Thus, the word mužčina “man” belongs to class II, yet it is masculine: bol’šoj mužčina “big man.” As is stressed by Aronoff (1994), gender is an essentially syntactic property, which governs agreement. Declension class is a purely morphological property which the syntax has no direct access to. Aronoff points out that the existence of arbitrary inflectional classes is one of the prime motivations for treating morphology as an autonomous linguistic module.
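The division of labour between gender and declension class can be sketched as follows. This is a toy model with invented names (`Noun`, `big`); the class-to-gender defaults are those stated above, while the nominative singular forms of “big” are transliterations supplied for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Default gender by declension class, as stated in the text.
DEFAULT_GENDER = {"I": "masc", "II": "fem", "III": "fem", "IV": "neut"}

# Nominative singular forms of "big", chosen by gender (illustrative).
BIG = {"masc": "bol'šoj", "fem": "bol'šaja", "neut": "bol'šoe"}


@dataclass(frozen=True)
class Noun:
    form: str
    declension: str                       # purely morphological property
    listed_gender: Optional[str] = None   # lexically listed exception, if any

    @property
    def gender(self) -> str:
        # A listed gender overrides the default predicted by the class.
        return self.listed_gender or DEFAULT_GENDER[self.declension]


def big(noun: Noun) -> str:
    """Agreement consults gender only; declension class is never inspected."""
    return f"{BIG[noun.gender]} {noun.form}"


man = Noun("mužčina", "II", listed_gender="masc")   # class II, yet masculine
book = Noun("kniga", "II")                          # class II, default feminine
print(big(man))
print(big(book))
```

Both nouns inflect according to class II, but `big` never reads `declension`, so the adjective form follows gender alone.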

We have seen that a direct object in Russian is in the accusative case. This can be thought of as an instance of government: a transitive verb governs the accusative. Likewise, prepositions in Russian have to take specific cases, as shown in (19):

Pictures 

Notice how “motion towards” as opposed to “location at” is signaled solely by case choice in (19b, c); otherwise, it is an arbitrary matter which preposition governs which case.

One of the perennial theoretical problems in morphology is whether there is a clear-cut distinction between inflection and derivation and if so how to draw it. Inflection is often thought to be “of relevance to syntax,” which is clearly true of contextual inflection, but not so obvious with inherent inflection. Yet we don’t want to say that plurals or past tenses are derivational and hence create new lexemes. Booij’s contextual / inherent distinction is designed to ameliorate this problem (though we are now left with the task of distinguishing inherent inflection from derivation). A typical borderline case is that of the aspectual forms of Chukchee given above. Chukchee has a set of six tense-aspect forms in which aspect (roughly perfective vs. imperfective) is grammaticalized and expressed as part of the obligatory conjugation system. However, the affixes illustrated in (9–13) are not like this. Rather, they are optional elements which are added to modify the overall meaning of the verb. Does this make them derivational, then? Do we wish to say that “to verb in a prolonged interrupted fashion” is a new lexeme related to verb (derivation) or a form of the word verb (inherent inflection)? Cases like this are quite common and promise to provide fertile ground for future research into the problem.


The Lexeme Concept


If we ask how many words are listed in (1) we can give at least two answers:

(1) {cat, cats}

In one sense there are obviously two, but in another sense there is only one word, cat, and only one entry will be found in a dictionary for it. The plural, cats, is formed by a completely general rule from the singular form cat and there is no need to record the plural form separately. In addition, we can describe cat as “the singular form of the word cat” and cats as “the plural form of the word cat.” This gives us another interpretation for the term “word,” as becomes clear when we look at the word “sheep.” Here the singular form of the word sheep has exactly the same shape as the plural form, even though these are distinct linguistic entities. Given the vagaries of English orthography, this identity of shape can be true of the spoken form, the written form, or both (as with “sheep”). Thus, the written shape of the base form of the verb “read” (pronounced like “reed”) is identical to that of the past tense, “read” (pronounced like “red”) despite the difference in pronunciation, while the taxes, the tax’s (“of the tax”) and the taxes’ (“of the taxes”) differ solely in spelling.

It is rather useful to have different terms for these three different senses of the word “word.” We will therefore say that there is a lexeme CAT which has two word forms, cat and cats. The names of lexemes are conventionally written in small capitals. The grammatical description “the singular / plural of CAT” is a grammatical word. Thus, sheep is one word form corresponding to one lexeme, but it is two grammatical words (the singular and the plural of SHEEP).
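The three senses of “word” can be kept apart in a small data structure. The sketch below is one possible encoding, not a standard one: the class name `Lexeme` and its methods are invented here, and a lexeme is simplified to a mapping from grammatical descriptions to spellings.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Lexeme:
    name: str               # lexeme name (small capitals by convention)
    cells: Dict[str, str]   # grammatical description -> word form

    def grammatical_words(self) -> List[str]:
        """One grammatical word per cell, e.g. 'plural of CAT'."""
        return [f"{label} of {self.name}" for label in self.cells]

    def word_forms(self) -> Set[str]:
        """Distinct shapes only; identical shapes collapse into one form."""
        return set(self.cells.values())


CAT = Lexeme("CAT", {"singular": "cat", "plural": "cats"})
SHEEP = Lexeme("SHEEP", {"singular": "sheep", "plural": "sheep"})

for lexeme in (CAT, SHEEP):
    print(lexeme.name,
          "grammatical words:", len(lexeme.grammatical_words()),
          "word forms:", len(lexeme.word_forms()))
```

Counting each way gives the different answers discussed above: one lexeme in each case, two grammatical words in each case, but two word forms for the first lexeme and only one for the second.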

We can think of a lexeme as a complex representation linking a (single) meaning with a set of word forms, or more accurately, linking a meaning with a set of grammatical words, which are then associated with corresponding word forms. From the point of view of the dictionary (or lexicon), this is therefore a lexical entry. There is no demand here that the set of forms correspond to only one meaning, or that only one set of forms correspond to a given meaning. If several forms correspond to one meaning we have pure synonymy: e.g. {boat, boats}, {ship, ships}. If a single form corresponds to more than one completely unrelated meaning, as with {write, right, rite} or {bank, bank}, then we have homophony or homonymy. We then treat the homophones / homonyms as distinct lexemes which just happen to share the same shape (written and / or spoken). In some cases these meanings are felt as related to each other, and we have a case of polysemy. Thus, the word “head” means a body part, the person in charge of an organization, a technical term in linguistics, and so on, and these meanings are associated by some kind of metaphorical extension. In general, polysemy tends to be either ignored (where the meanings are close) or treated like homophony.

In linguistics a form-meaning pair is a sign and the lexeme is a prototypical example of a sign. The traditional definition of morpheme is “the smallest meaningful component of a word,” and this entails that we consider all morphemes as signs. However, this turns out to be very controversial, for some types of morpheme, at least.

Morphology by Andrew Spencer


Morphology is about the structure of words. All languages have words and in all languages some words, at least, have an internal structure, and consist of one or more morphemes. Thus, the form cats comprises the root morpheme “cat” to which is added the suffix morpheme “s” indicating plural. Now, for this characterization to mean anything we have to know what a word is. How do we know, for instance, that a string such as the cat is two separate words, and that the is not a prefix? Conversely, how do we know that the “s” of cats isn’t a word in its own right? Here we need the help of syntax: the cat is a phrase which can be extended by addition of other phrases: the very black cat. The form cats can never be split up this way, the reason being that the “s” component is an element which can only exist as part of a word, specifically at the end of a noun. In other words, “s” is a suffix and hence a bound morpheme. The property of indivisibility exhibited by cats is lexical integrity. A single word such as cats contrasts rather neatly with the fully fledged (but synonymous) phrase more than one cat, in which it is clear that more, than, and one are all independent words and can all be separated by other words or phrases.

This chapter will examine the different structures words exhibit and the morphological relationships they bear to each other, and the nature of the morpheme. We begin by clarifying the notion “word” itself.