Language, Linguistics, Logic, and Life . . . . . . . . . . . . . by Teresa Elms


Nonlinearity in Language: Chomsky Was Right

Posted by Teresa Elms on 2 February 2008

When we talk about “nonlinear systems” today, we generally mean complex dynamic systems that have self-organizing properties under conditions near the boundary of chaos. “Chaos theory” is the popular shorthand term for the study of such systems. But “nonlinearity” can be used in a different technical sense to refer to dimensionality. A linear system is, in this sense, one-dimensional. A nonlinear system may be two-dimensional, three-dimensional, or n-dimensional; that is, a nonlinear system has dimensionality greater than one. (This definition happens to work for fractional dimensionality as well as integer dimensionality, which will be convenient for us later.)

As it happens, Noam Chomsky twigged to the multidimensionality of language rather early. In a paper published back in 1956 by the Institute of Radio Engineers (now the IEEE), Chomsky demonstrated that human language, with its embedded constituent hierarchy, is inherently nonlinear in the dimensional sense. He then used this finding to discriminate among three possible models of language generation: (1) Markov processes; (2) phrase structure grammars; and (3) transformational grammars.
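The distinction among those models can be made concrete with the classic example language { aⁿbⁿ : n ≥ 1 }, the kind of counted, nested dependency Chomsky's argument turns on. Below is a toy sketch (my own Python, not anything from the 1956 paper): a Markov-style check that judges each symbol only by its immediate predecessor over-accepts, while a single unbounded counter, the extra memory a phrase structure rule like S → a S b | a b implies, decides the language exactly.

```python
# Every adjacent pair that occurs in a genuine a^n b^n string.
LEGAL_BIGRAMS = {('a', 'a'), ('a', 'b'), ('b', 'b')}

def markov_style_check(s):
    """Judge each symbol only by its immediate predecessor, as a
    finite-order Markov process would: purely local legality."""
    return bool(s) and all(p in LEGAL_BIGRAMS for p in zip(s, s[1:]))

def phrase_structure_check(s):
    """Decide a^n b^n (n >= 1) with one unbounded counter: count the
    a's up, count the b's down, and demand a perfect match."""
    count = i = 0
    while i < len(s) and s[i] == 'a':
        count += 1
        i += 1
    while i < len(s) and s[i] == 'b':
        count -= 1
        i += 1
    return i == len(s) and count == 0 and i > 0
```

The string 'aab' passes the local bigram test (every adjacent pair is legal) yet is not in the language; no amount of widening the fixed window fixes this, because n is unbounded.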

Transformational grammars won, of course.

But to me, the nonlinear structure of language is a fact of greater import than any theoretical conclusions it might have supported in 1956. To see why, it may be useful to recreate the fundamental insight.

Imagine that you are one of several beads on a string. Each bead represents the current output of a production system that emits language one unit at a time. All the beads are roughly the same in size or granularity or scale; that is, they all consist of phonemes, or morphemes, or words, or similar units of coded language output. The scale is not important, so long as the same scale is maintained throughout the production process.

Now, being a bead on a linear string, you can exchange information with the beads immediately before or after you, but there is no way for you to peek around these adjacent beads to learn about the beads further up and down the line. There is no “up” or “down” or “left” or “right” in which you can extend a head or hand and take a peek. You can detect which morpheme is carried by the bead in front of you. You can detect the fact that the bead behind you is empty. You can use these two facts and some internal transitional probabilities to generate a new morpheme to fill the empty bead that follows. But there are certain things you can’t do. For example, you can’t:

  • Repeat the contents of the previous n beads. (After all, you can’t see back n beads.)
  • Emit the contents of the previous n beads in reverse order. (Again, you can’t see back n beads.)
  • Repeat the content of the immediately previous bead n times. (After you emit the first repetition, you can’t see back beyond it to count how many times you’ve repeated it, so you never know when to stop.)
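The second impossible operation, emitting a prior sequence in reverse, corresponds to the mirror-image languages. Here is a hedged sketch (my own toy encoding, with 'c' as a center marker) of the contrast: memory that lives outside the bead line, a stack, decides the mirror language easily, while a one-bead-lookback check cannot tell genuine mirror strings from locally plausible impostors.

```python
def mirror_with_stack(s):
    """Recognize w + 'c' + reversed(w) using a stack that exists
    outside the linear bead world."""
    stack = []
    it = iter(s)
    for ch in it:
        if ch == 'c':
            break
        stack.append(ch)
    else:
        return False  # no center marker: not a mirror string
    for ch in it:
        if not stack or stack.pop() != ch:
            return False
    return not stack

# Bigrams harvested from genuine mirror strings such as 'abcba' and 'bacab'.
LEGAL_BIGRAMS = {('a', 'b'), ('b', 'c'), ('c', 'b'),
                 ('b', 'a'), ('a', 'c'), ('c', 'a')}

def mirror_with_one_bead_lookback(s):
    """Markov-style test: each bead is judged only against its
    immediate predecessor."""
    return all(p in LEGAL_BIGRAMS for p in zip(s, s[1:]))
```

The impostor 'abcab' is built entirely from legal bigrams, so the one-bead-lookback test accepts it; the stack-based recognizer correctly rejects it, because only the stack remembers the whole prefix.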

These restrictions derive from three factors inherent in the nature of information and linearity:

  1. Linear, point-to-point connectivity restricts the information flow that can occur in the world.
  2. A bead is finite in size, and so has a finite (possibly very small) memory for the contents of preceding beads or following beads; that is, there is a restriction on the storage of information at any one point in the world. 
  3. A “bead reader/recorder” that sees all beads in sequence might have a memory for any sequence that passes through it, even an infinite memory — but any such memory must exist outside the linear bead world, as does the bead reader/recorder system itself.
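The third factor is easy to dramatize in code. A minimal sketch (my own illustration, not from the paper): once a reader/recorder standing outside the bead string can store a span of beads, the forbidden operations become trivial one-liners.

```python
def reverse_last_n(beads, n):
    """External recorder: replay the last n beads in reverse order."""
    memory = list(beads[-n:])  # storage outside the linear bead world
    return beads + memory[::-1]

def repeat_last(beads, n):
    """Repeat the immediately preceding bead n times; the count is
    kept externally, so the recorder knows when to stop."""
    return beads + [beads[-1]] * n
```

Both operations are impossible for a bead that sees only its neighbor, yet effortless for any system holding memory outside the string, which is exactly the point of factor 3.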

Consequently, if you happen to believe the human language system is capable of repeating a prior sequence of a certain length, or of repeating the immediately preceding form n times, or of inverting the order of a prior sequence, you are tacitly acknowledging the nonlinearity of language. I think that’s wonderful.

Chomsky doesn’t state these notions in terms of dimensional connectivity and information flow, as I do, but the facts are implicit in what he does say. Chomsky’s full paper is available online at–.pdf. Citation information is available on the IEEE Web site, and IEEE members can get free full-text access as well, at

Posted in Linguistics | 3 Comments »