
The Recovery of Case | David Berlinski & Juan Uriagereka

It is April 1977. Noam Chomsky and Howard Lasnik are about to publish an important essay in linguistics. Having seen and studied the preprint of “Filters and Control,” Jean-Roger Vergnaud wrote to its authors. He had “some ideas to communicate.” Chomsky and Lasnik were unable to incorporate those ideas in their essay. Time was short; the mail, slow. They did something better. They incorporated them into their work.

The letter has become famous among linguists, outlasting, if not outliving, its author.

Jean-Roger Vergnaud died in Los Angeles in 2011.

Sight Unseen

Most readers are not likely to have seen this sentence before:

  1. The dung ate the slug’s tail on the sum of 2 + 2.

It is not a sentence that suggests very much, but it is a grammatical English sentence. This is something that English speakers recognize at once, and recognize without effort. Sentences such as 1 may be embedded in still other sentences:

    1a. Solomon says that (the dung ate the slug’s tail on the sum of 2 + 2).
    1b. I heard (him say (that the dung ate the slug’s tail on the sum of 2 + 2)).
    1c. Readers realize (that I heard (him say (that…))).

If c is a grammatical English sentence, then why not

    1d. Ralph believes that (readers realize (that I heard (him say (that…)))),

and so on ad infinitum? An allusion to infinity suggests an obvious question: how could infinitely many sentences be encompassed by the human brain, which, like the human liver, is blunt in its boundaries? In the first half of the twentieth century, Alonzo Church, Kurt Gödel, Emil Post, and Alan Turing created in the theory of recursive functions a mathematical scheme commensurate with the question’s intellectual dignity. The theory is one of the glories of twentieth-century mathematics. The factorial function n!, to take a simple example, is defined over the numbers n = 0, 1, 2, 3, …. Its domain and range are infinite. Two clauses are required to subordinate the infinite to finite control. The base case is defined outright: 0! = 1; and, thereafter, (n + 1)! = (n + 1)n! If the functions inherent in a natural language are recursive, the language that contains them comprises infinitely many sentences.
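A minimal sketch in Python (ours, not the theory’s) shows how the two clauses subordinate the infinite to finite control:

```python
def factorial(n):
    """The factorial n!, defined over n = 0, 1, 2, 3, ... by two clauses."""
    if n == 0:
        return 1                     # the base case, defined outright: 0! = 1
    return n * factorial(n - 1)      # thereafter: n! = n * (n - 1)!, equivalently (n + 1)! = (n + 1)n!

print(factorial(5))  # 120
```

A finite definition, an infinite domain and range.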

Sentences used in the ordinary give-and-take of things are, of course, limited in their length. Henry James could not have constructed a thousand-word sentence without writing it down or suffering a stroke. Nor is recursion needed to convey the shock of the new. Four plain-spoken words are quite enough: Please welcome President Trump. Prefacing 1d, on the other hand, with yet another iteration of Ralph believes, is no improvement on the original. Quite the contrary. It is a deprovement, like one hundred rounds of “For He’s a Jolly Good Fellow.” If sentences in English can be new without recursion, they can also be recursive without being new. The rules of grammar establish only that natural languages are infinite. Why they are as they are, no one knows. The same displacement of attention is at work in arithmetic. For anyone unaccountably persuaded that thirty-eight is the largest natural number, the rules of arithmetic say otherwise. The rules, note. The argument needs no further steps.

Native Speakers Speak

For years, American psychologists affirmed, on the basis of no evidence whatsoever, that children acquire their native language by an arduous process of discipline and training. B. F. Skinner had taught pigeons some simple skills and saw no reason that the same principles of stimulus conditioning could not explain how Chinese children acquired Mandarin.

He was mistaken.

Children acquire their native language without training, and what training they do receive is haphazard, degenerate, incomplete, or fragmentary.

Consider 2a and 2b:

    2a. You are happy.
    2b. Are you happy?

There is only one verb in a. It goes to the left in b. It is just possible to imagine that a child, having mastered a, could be brought to master b by reinforcement of some sort—a series of electrical shocks, perhaps.

But consider

    3a. Anyone who is interested can see me later.
    3b. *Is anyone who interested can see me later?

The strategy employed at 2b results in the verbal hash of 3b, a point marked by a Cyclopean asterisk. The correct question is

    3c. Can anyone who is interested see me later?

Children otherwise confused by the exigencies of the spoon never make the mistake in 3b. “Knowledge of language,” Chomsky and Lasnik remarked, “extends far beyond available experience.” On the level of the niceties, experts may prevail, as when the French Academy bans le week-end or le snack as Anglophone abominations; but a language belongs to its speakers, and it is their intuitions that determine what it can say and how it can say it.

Where else to turn; who else would know?

Chomsky has always been interested in the most elementary forms of grammar, and so in what is obvious enough to be overlooked.

    4a. *(I) love *(Lucy).
    4b. (Yo) amo *(a Lucy).
    4c. (Nik) maite dut (Lucy).

In English (a), one cannot drop subjects or objects; in Spanish (b), one can drop subjects but not objects; in Basque (c), one can drop them equally and good riddance to them both. The morphology of Nik, the Basque I, and Lucy, the American comedienne, remains encoded in the verb maite dut. In English, the only possible order of constituents is I love Lucy. In Basque, all permutations are possible. Every speaker of English, Spanish or Basque knows such facts. Asking native speakers for their judgment is an imperative of research. For generative grammarians, it would appear to be the only imperative. Field work? “That is a complete waste of your time and the government’s money,” the linguist Robert Lees wrote to some grant-seeking schnorrer. “You are a native speaker of English; in ten minutes you can produce more illustrations of any point in English grammar than you will find in many millions of words of random text.”

Native speakers retain their authority about native speech even if their judgments are less than categorical. He met his wife in Italy, natives say. He met in Italy his wife is off, although he met in Italy his wife of three days goes down better. In Lolita, Vladimir Nabokov may sometimes be seen seated on the sofa of these solecisms. Then again, if native speakers flag he met in Italy his wife, it is not as decisively as they flog in wife he Italy met. Some linguistic intuitions are all or nothing; others not.

With doctors, it is the same thing.

That cholesterol level of yours? Not so bad. Not so good either.

Great Goals of Fire

In his masterpiece of 1965, Aspects of the Theory of Syntax, Chomsky set three goals for linguistic theory. Two of them are trite. A linguistic theory must be observationally adequate, ruling out 3b and ruling in 3a. And it must be descriptively adequate, accounting for the properties of English or Japanese in terms of their grammars—the device that “gives a correct account of the linguistic intuition of the native speaker….” That native speaker is, au fait, no off-the-street knock-off. On the contrary. Chomsky’s native speaker is a one-man Platonic form. He is, Chomsky affirmed,

an ideal speaker-listener, in a completely homogeneous speech community, who knows a language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors in applying his knowledge of the language to actual performance.

Observational and descriptive adequacy, generously understood, have long been counted as goals of traditional linguistics. Vouchsafed the chance to read a preprint of Chomsky’s Syntactic Structures in the fourth century BC, the Sanskrit grammarian, Pānini, would have felt right at home. Explanatory adequacy is otherwise. The grammar of Greek is intended to explain Greek to the Greeks, even if, in the end, it is all Greek to those Greeks. Universal Grammar (UG) is intended to specify the most general principles of human language. It must provide an explanation for the extraordinary fact that a Japanese child raised in Paris will acquire French, but not Japanese, and a French child raised in Tokyo, Japanese, but not French. Either child may acquire both French and Japanese, of course, but neither will fail to acquire French or Japanese. Linguists and philosophers may have known this in antiquity; they did not say so with any great conviction, and they may not have said so at all. It was left to Chomsky to remark with the full force of his genius that every human language can be acquired by any human being. Universal Grammar, Chomsky concluded, must be a species-specific characteristic of the human race, biologically encoded, genetically transmitted.

In & Out

A descriptive grammar adequate to the demands of a particular language comprises a hideously complicated system of rules. A Grammar of Contemporary English, by Randolph Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik, runs to more than one thousand pages; and even at that length, readers may well conclude that the indefatigable Quirk, Greenbaum, Leech and Svartvik were just warming up. By the 1970s, it was becoming clear that no system of compromises could completely reconcile the rules of Japanese or Hungarian with the aims and claims of Universal Grammar. “There is a certain tension,” Chomsky and Lasnik wrote with some understatement, between these pursuits.

To attain explanatory adequacy, it is in general necessary to restrict the class of possible grammars, whereas the pursuit of descriptive adequacy often seems to require elaborating the mechanisms available and thus extending the class of possible grammars.

It makes no sense to assign to Universal Grammar the complex and often rebarbative grammatical rules of every human language. No one is born knowing the grammar of Mingrelian. The grammatical distinctions between even closely related languages are almost always sharp as swords. Both English and French are prepared to have Ernest get rid of some poor schlub named Bill, but in English, there is the simple sane syntax of

  5. Ernest wants Bill to go,

while in French, the subjunctive is needed, as in

  6. Ernest veut que Bill s’en aille,

or

  7. Ernest veut que Bill parte.

A word-by-word translation of 5 yields only

  8. *Ernest veut Bill aller,

which, although comprehensible, is pourri jusqu’à la moelle, as fastidious French snoots might say. It is bad to the bone. (Google translates 5 as Ernest veut le projet de loi pour aller, thus suggesting that, Stephen Hawking and Elon Musk notwithstanding, anxieties about artificial intelligence are somewhat premature.) Children learning English acquire 5; learning French, they acquire 6 and 7. An acquisition suggests something they might have just picked up, but not knowing the grammar of either language, just how did they pick up anything at all?

“The history of transformational generative grammar,” Mark Baltin observed, “can be divided into two periods, which can be called expansion and retrenchment.” This has given contemporary linguistics a very characteristic breathing-in and breathing-out structure. “During the early expansion period, a primary concern was the description of grammatical phenomena.”

Breathing in.

Explanatory theory “was correspondingly loose …”

Breathing out.

During the retrenchment period … the focus of attention shifted from the construction of relatively complex … statements to the construction of a general theory of grammar, restricted as to the devices it employed, which could be ascribed to universal grammar.

Breathing in.

Whatever the balance between complicated and quite specific rule systems and Universal Grammar, it was clear by the 1970s that one of them would have to take precedence over the other.

The Courtiers Gather

In 1975, Massimo Piattelli-Palmarini organized an encounter between Jean Piaget and Noam Chomsky at the Royaumont Abbey, some twenty miles or so north of Paris. In an essay entitled “Encounter at Royaumont,” Howard Gardner recalled the ingathering of courtiers at what was formerly an austere Cistercian monastery. The grace of God prevented Piattelli-Palmarini from organizing the conference at a Carthusian monastery, where vows of silence would have prevailed. In attendance, Gardner wrote, were

Nobel laureates in biology, leading figures in philosophy and mathematics, and several of the most prominent behavior scientists … It was almost as if two of the great figures of the seventeenth century—Descartes and Locke, say—could have defied time and space to engage in a joint meeting of the Royal Society and the Académie Française.

On s’imagine cela. Piaget had long been eager to participate in such a discussion; Chomsky, less so. But in the end, “he accepted the invitation proffered by the late Jacques Monod.” Gardner emphasized how the meeting influenced “the future awarding of research funds, the interests of the brightest young scholars, and, indeed, the course of subsequent investigations of human cognition.”

Events followed a familiar academic trajectory:

Piaget noted “all the essential points in this about which I think I agree with Chomsky.” And Chomsky acknowledged “Piaget’s interesting remarks.” As the discussion proceeded and became increasingly heated, the tone became distinctly less friendly. Piaget criticized the nativist position as “weak” and “useless,” even as Chomsky described certain Piagetian assertions as “false,” “inconceivable,” and (in a mathematical sense), “trivial.”

By common consent, Chomsky impressed biologists otherwise well-disposed to Piaget with the force of his arguments and the precision and pertinence of his examples. When he wearied of combat, Chomsky ceded the floor to his Minister of War, Jerry Fodor, who succeeded in further flabbergasting the biologists by insisting that his most trivial remarks had the structure of a logical proof.

In fact, Gardner added, the keynote for the conference at Royaumont was set by the cybernetician, Guy Cellérier, who compared the development of the mind to climbing a hill. Cellérier did not need to add that by climbing a hill, he meant climbing up the hill, an interesting example of the tacit knowledge to which ordinary speakers appeal the minute they open their mouths.

And the Paradigm Shifts

Royaumont was notable in marking the beginning of what Chomsky would later call the bio-linguistic paradigm. Long before Royaumont, Chomsky had argued that the acquisition of a language in childhood represents nothing less than the maturation of a biological system; long after Royaumont, he argued that he had been right long before Royaumont. “Assuming that language has general properties of other biological systems,” Chomsky wrote in 2007,

we should be seeking three factors that enter into its growth in the individual: (i) genetic factors, the topic of UG, (ii) experience, which permits variation within a fairly narrow range, and (iii) principles not specific to language. The third factor includes principles of efficient computation, which would be expected to be of particular significance for systems such as language.

These are, in their largest aspects, principles that govern the development of the visual system or the maturational progression into puberty. No one learns to see in three dimensions or to interpret an arrow in flight as a figure moving against an unchanging background. Children see what they see and grow as they do, and at the age of thirteen or so, the boys, at least, lose their elfin graces and enter into the semi-adult world of bullfrog-like voices and ripe pustules.

Some biologists welcomed Chomsky’s discovery that he was, deep down, a biologist with a marked lack of enthusiasm. Skinner had long insisted that behaviorism was nothing more than a local form of Darwinian evolution; he concluded that behaviorism must be correct in virtue of its reflected glory. That the argument might go in reverse, like leverage in the commodities market, did not occur to him. In his review of Skinner’s Verbal Behavior, Chomsky emptied Skinner’s reputation of its brimming content, and on those occasions in which he had talked or written about the evolution of the language faculty, he seemed to suggest that since nothing was known, anything could be said. This came perilously close to a kind of contemptuous indifference to Darwinian doctrinal affiliations.

However far it might have been from biology itself, the bio-linguistic perspective did suggest a strategy by which universal and particular grammars could be seen as aspects of a single system. The principles of Universal Grammar were assigned a regulatory role in the governance of every human language. Some were so obvious as to have gone unmentioned for thousands of years. Latin grammarians certainly knew that Latin is constructed from a finite number of words. It has a distinctive atomic structure. So does every human language. The grammarians failed only to observe, or to remark, that there is no obvious reason why this should be so. Other topologies are possible. Giraud’s Theorem describes an association between a first-order theory and a Grothendieck topos, one that goes from the austerities of the theory’s logical structure to its meaning.

No natural language goes there or does that.

The atomic structure of language belongs to the familiar category of facts that seem to have been well known without ever having been widely remarked. The A over A principle is otherwise. It was not known at all until Chomsky presented it to an audience of uncomprehending linguists in 1962. They had never heard of such a thing. If a rule ambiguously applies to some element A in a structure of the form … (A …(A… )), the rule must apply to the largest (or the longest) bracketed A-like constituent before it applies to A. Thus (A …(A… )) over (A… ).

There are two relevant noun phrases (NP) in

  9. I won’t forget (NP my promise to (NP that idiot Washburn)),

but the rule governing which of them may be extracted stops

  10. *That idiot Washburn, I won’t forget my promise to,

dead in its tracks, while waving a white baton at

  11. My promise to that idiot Washburn, I won’t forget.
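A toy rendering of the A over A principle may make the bracketing vivid: phrase markers as nested tuples, with a rule targeting category A obliged to apply to the largest bracketed A-like constituent. The representation is our illustrative shorthand, not a serious parser.

```python
def largest_constituent(tree, category):
    """Return the outermost subtree labeled `category`, searching top-down."""
    label, children = tree[0], tree[1:]
    if label == category:
        return tree  # the largest A-like constituent wins: A over A
    for child in children:
        if isinstance(child, tuple):
            found = largest_constituent(child, category)
            if found is not None:
                return found
    return None

# (NP my promise to (NP that idiot Washburn))
inner = ("NP", "that", "idiot", "Washburn")
outer = ("NP", "my", "promise", "to", inner)
sentence = ("S", "I", ("VP", "won't", "forget", outer))

# A rule extracting an NP must target the outer NP, never the embedded one.
print(largest_constituent(sentence, "NP")[1:4])  # ('my', 'promise', 'to')
```

Asked for an NP, the procedure never returns that idiot Washburn on his own; the larger promise claims him first.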

Whatever the principles of UG, counting, curiously enough, is not among them. Human linguistic and arithmetic abilities seem to belong to systems maintaining only the most distant of diplomatic relations. In a Chinese fragment from 200 BC, a student asks his teacher whether he should spend more time learning speech or numbers. His teacher replies: “If my good sir cannot fathom both at once, then abandon speech and fathom numbers, (for) numbers can speak, (but) speech cannot number.” No grammatical rule involves counting words, because no grammatical rule involves counting anything. Even the linear order of a natural language, in which one word comes after another, is a concession to the limitations imposed on speech by a single channel of communication. Were human beings able simultaneously to speak through their mouths and snort through their noses, the demands of linearity might well be relaxed.

Hear, Hear

Throughout the nineteen-sixties, many working linguists paid lip service to UG, but when it came to making sense of the devilishly complicated structure of the English language, their proposals were tame to the point of triteness. More rules, better rules, rules without limit, and so rules without end. Robert Lees’s dissertation, “The Grammar of English Nominalizations,” was published in 1960, and under his tense pre-word processing thumbs, English nominalizations appeared to be governed by as many rules as the Halakha.

At some time in the 1970s, it became clear to linguists that more rules, like more gravy, was an injunction subject to the law of diminishing marginal utility. In their paper about filters and control, Chomsky and Lasnik codified with increasing confidence a radically disjunctive view of linguistic theory. Beyond its universal principles, UG contained a system of open binary parameters. The universal principles were true of all languages; but “an actual language,” Chomsky and Lasnik wrote, “is determined by fixing the parameters of (the) core grammar.”

This idea was very much in the air. So many things are. A line of influence ran from microbiology to generative grammar. Chomsky had been deeply impressed by the operon model of the bacterial cell—the work of Jacques Monod and François Jacob. In his Nobel Prize address, Jacob provided a long look back:

We can therefore envision the activity of the genome of E. coli as follows. The expression of the genetic material requires a continuous flow of unstable messengers which dictate to the ribosomal machinery the specificity of the proteins to be made. The genetic material consists of operons containing one or more genes, each operon giving rise to one messenger. The production of messenger by the operon is, in one way or another, inhibited by regulatory loops composed of three elements: regulatory gene, repressor, operator. Specific metabolites intervene at the level of these loops to play their role as signals: in inducible systems, to inactivate the repressor and hence allow production of messenger and ultimately of proteins; in repressible systems, to activate the repressor, and hence inhibit production of messenger and of proteins. According to this scheme, only a fraction of the genes of the cell can be expressed at any moment, while the others remain repressed. (emphasis added) The network of specific, genetically determined circuits selects at any given time the segments of DNA that are to be transcribed into messenger and consequently translated into proteins, as a function of the chemical signals coming from the cytoplasm and from the environment.

The success of these ideas in prokaryotic populations prompted both Monod and Jacob to generalize them to encompass the eukaryotes. “What accounts for the difference between a butterfly and a lion, a chicken and a fly, or a worm and a whale,” Jacob declared, “is not their chemical components, but varying distributions of these components.” The claim has become famous. Whether it is true is another matter entirely. One could with equal justice say that what accounts for the difference between the great pyramid at Giza and the Large Hadron Collider in Geneva is a matter merely of the varying distribution of their components.

For all that, it is possible to see in Jacob’s remarks the emerging outlines of the principles and parameters (P&P) approach to linguistic theory. English is a heads-up, or head-initial, language. The picture is hanging on the wall. In Japanese, it is the other way around. E wa kabe ni kakatte imasu. (The) picture wall on is hanging. Japanese is a heads-down, or head-final, language. Once a parameter has been set, its influence ramifies throughout the grammar of the set-upon language.
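The head-direction parameter can be caricatured in a few lines: one binary switch fixing the order of a head and its complement. The lexical items below are illustrative only.

```python
def linearize(head, complement, head_initial=True):
    """Order a head and its complement according to one binary parameter."""
    return f"{head} {complement}" if head_initial else f"{complement} {head}"

# English, heads up: the preposition precedes its object.
print(linearize("on", "the wall", head_initial=True))    # on the wall
# Japanese, heads down: the postposition follows its object.
print(linearize("ni", "kabe", head_initial=False))       # kabe ni
```

One switch, thrown once, and thrown everywhere.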

The question how human languages can be fundamentally the same if they are so very different invites the peremptory, but premature, response that if they are so very different, they could not be fundamentally the same. Not so. Two languages may be alike because both languages respect the principles of UG; and unalike because they vary across a finite number of binary parameters. Languages as far apart as Mohawk and English, Mark Baker argued in The Atoms of Language, are separated by only a handful of parameters. When the parameters of Mohawk are changed in favor of their English-language settings, it becomes clear that Mohawk speakers intended to speak English all along. And vice versa, of course. Anthropologists have made similar arguments about human nature. It is everywhere the same except in matters of sexual discretion, taste, fashion, coloring, clothing, and the way in which to greet the rising or the setting sun. It was on encountering the Nambikwara that Claude Lévi-Strauss realized that this was so, and assigned an improvement in his humility to the experience. The handful of differential parameters by which men are separated are more noticeable than the great universal principles by which men come to recognize one another as men—but they are less important.

Languages, too. It is the same.

As all suns smolder in a single sun, the word is many but the word is one.

Straight Outta COMPton

Introduced in 1967 by the linguist, Peter Rosenbaum, a complementizer, or COMP, is, on generative principles, a part of speech, and like NOUN, or VERB, entitled to the majestic upsweep of capitalization, the brand mark of boldface. Before tensed sentences, COMP figures as that

  12. Trottweiler believes that silence is golden;

before infinitives, as for

  13. Trottweiler prefers for Agnes to keep quiet.

Whether, if, whither and whereupon are among the COMPs; and COMP as a category may be empty, too; that and for have both been disappeared from 14

  14. I think Trottweiler prefers Agnes to keep quiet.

In her MIT PhD dissertation, “Theory of Complementation in English Syntax,” Joan Bresnan argued that COMP should get its own category, Ś. Generative grammarians had until that very moment thought of S (Sentence) as the highest of categories (Größte Kategorie aller Zeit, as German linguists like to say). Unlike traditional grammarians and schoolteachers, who had diagrammed sentences in terms of subjects, verbs, and objects, generative grammarians argued that sentences were strictly two-man jobs:

  15. S → NP + VP,

where NP is a noun phrase, and VP, a verb phrase. But whether S is a two- or a three-man outfit, on Bresnan’s view, it has a back-up in Ś:

  16. Ś → COMP S.

Although COMP is a category comprising the most ordinary of words—that, for, if, after all—it has played an outsized role in the ongoing drama of generative grammar.

Anarchy & Order

Well before Chomsky, Leonard Bloomfield had argued that “the lexicon of a natural language is basically an appendix of its grammar, a list of its basic irregularities.” This distinction between the orderliness of a grammar and the anarchy of its lexicon, generative grammarians carried over intact. They were happy to do so. The lexicon of a natural language is not really a dictionary. It does not define a cow as Animal quadrupes ruminians cornutum, as Samuel Johnson remarked in observing that definitions often make things darker. A lexicon is closer to a chrestomathy—of words and idioms, obviously, but of morphemes, too, when necessary. Lexical items are identified by their features, the lexical “dog” listed as (+ N), (+ ‘dɒg), (+ count), (+ animate), (– artifact), (– stative), (+ slobbering) … It is in the lexicon that one sees naked the primitive connection between sound and meaning. There is a dog in the English lexicon, un chien in the French, ein Hund in the German, and there is no better reason that this should be so beyond the fact that it is so.

Grammatical rules do not reach down to touch the anarchy of such facts. This is entirely compatible with the hypothesis that, morphological differences aside, human beings share a single lexicon, so that words, like electrons in quantum field theory, are all essentially identical.

The classification of lexical items in terms of their binary features carries over to grammatical categories. Languages have nouns, verbs, adjectives, and prepositions. Chomsky presented an analysis of these distinctions in terms of two binary features: +/- N and +/- V. A noun is all N and no V. A verb is all V and no N. An adjective is both + N and + V; but a preposition is neither and so figures as a grammatical eunuch. “We might just as well eliminate the distinction between feature and category,” Chomsky remarked, “and regard all symbols of the grammar as sets of features.”

A four-fold scheme is the result:

           + N           – N
  + V      Adjective     Verb
  – V      Noun          Preposition
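The scheme amounts to a function from the two binary features to the four categories; a sketch:

```python
# The four-fold scheme: the grammatical category is fixed entirely by
# the two binary features +/-N and +/-V.
CATEGORIES = {
    (True, True): "Adjective",     # +N, +V
    (False, True): "Verb",         # -N, +V
    (True, False): "Noun",         # +N, -V
    (False, False): "Preposition", # -N, -V: the grammatical eunuch
}

def category(N, V):
    """Return the category determined by the feature pair (N, V)."""
    return CATEGORIES[(N, V)]

print(category(N=True, V=False))  # Noun
```

All N and no V, a noun; neither, a preposition.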

The old-fashioned analytical apparatus, by which nouns were regarded as the names of persons, places or things, and verbs were thought somehow to designate actions, has been given up for good. Nothing much is left of the ancient Aristotelian categories either. A similar movement has taken place in biology, as evolutionary biologists have come to realize that, like the parts of speech in generative grammar, species are nothing more than ever-shifting sets of features. Willi Hennig published his masterpiece, Grundzüge einer Theorie der phylogenetischen Systematik (The Foundations of a Theory of Phylogenetic Systematics), just seven years before Noam Chomsky published Syntactic Structures. It took biologists twenty years, or more, to understand what he had done.

The irregularity of a natural language having been reduced, and confined, to its lexicon, its orderliness is expressed by its rules. In “Filters and Control,” Chomsky and Lasnik expressed themselves satisfied with what linguists had come to call the Extended Standard Theory, or EST. Most rules are context free. They have a common form:

  α → … β …

To the left of this scheme, a single symbol, α; in the middle, an arrow indicating that the single symbol must be rewritten; and to the right, the rewritten result, the insulating down of three dots serving to show that rewriting conveys one symbol to any number of them.

The EST contains an obvious rule by which a sentence may be rewritten as a noun phrase and a verb phrase:

  17. S → NP + VP.

But a verb phrase, the EST at once sings out, may also be rewritten as a verb together with a complementizer:

  18. VP → V + Ś.

This introduces a recursive loop

  19. Ś → COMP S,

the initial S in 17 now reappearing in 19.

Phrase structure rules give rise to base phrase markers, 19 leaving a structural residue in 20:

  20. (S NP (VP V (PP P (NP D N)))).

After lexical insertion, 20, but not poor Luca, comes vividly to life in

  21. Luca Brasi sleeps with the fishes.

Recursion serves to promote 21 to

  22. Tessio said that Luca Brasi sleeps with the fishes,

a process sine fine, as Latin rhetoricians would say, an incidental question from an inattentive mobster—

  • What did Clemenza just ask?

sooner or later encompassing all of the Corleones.

  23. Clemenza asked whether Tessio said that Luca Brasi sleeps with the fishes.
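The recursive loop from S through Ś and back to S, the loop the essay finds in rules 17 through 19, can be simulated directly. A toy rewriting system follows, with a depth bound standing in for the performance limits that keep real sentences finite; the grammar and lexicon are our own illustrative choices, not the EST’s.

```python
import random

RULES = {
    "S": [["NP", "VP"]],              # S → NP VP
    "VP": [["V", "SBAR"], ["Vi"]],    # VP → V S-bar reopens the loop; Vi closes it
    "SBAR": [["COMP", "S"]],          # S-bar → COMP S
}
LEXICON = {
    "NP": ["Tessio", "Clemenza", "Luca Brasi"],
    "V": ["said", "whispered"],
    "Vi": ["sleeps with the fishes"],
    "COMP": ["that"],
}

def generate(symbol, depth=0, max_depth=3):
    """Rewrite `symbol` until only lexical items remain."""
    if symbol in LEXICON:
        return random.choice(LEXICON[symbol])   # lexical insertion
    options = RULES[symbol]
    if symbol == "VP" and depth >= max_depth:
        options = [["Vi"]]                      # bound the recursion, as performance does
    expansion = random.choice(options)
    return " ".join(generate(s, depth + 1, max_depth) for s in expansion)

random.seed(7)
print(generate("S"))
```

Raise the depth bound and the Corleones go on saying that one another said it, sine fine.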

But phrase structure rules do only so much. The EST required still other grammatical operations to accommodate an undertaking so simple as raising a question.

The conveyance from

  24. Luca Brasi is sleeping with the fishes

to

  25. Who is Luca Brasi sleeping with?

or even

  26. With whom is Luca Brasi sleeping?

—the question of a gangster with a taste for fancy diction—cannot be achieved by phrase structure rules without a loss of generality. This was the burden of Chomsky’s Syntactic Structures. Some linguists and philosophers were skeptical. Surely it is possible to come up with rules yielding simple phrase-markers for 25 or 26? A certain amount of effort was devoted to constructing transformational grammars without transformations, an undertaking that, in retrospect, suggests the correlative ambition to construct airplanes without lift. The problem lies in the phrasal descent from (sleeping with the fishes) to (sleeping with) to (sleeping). The meaning of the verb does not change as a question is being asked. It is the very same verb throughout—sleeping with x. What is variable is only which x anyone is sleeping with, an observation commonly made about domestic affairs as well as grammar. This is just what those cascading phrase markers fail to reveal.

The generation of a question involves a complex mapping between phrase structures. Transformations are required and this is a context sensitive process. Functioning rather like hash functions, transformations take complex arrays into complex arrays and so elaborate phrasal contexts into elaborate phrasal contexts:

(Luca Brasi is sleeping with the fishes) → (With whom is Luca Brasi sleeping)?

Ah yes, the fishes.

COMP Constructions

In her influential PhD dissertation, Joan Bresnan had noticed that COMP constructions are remarkably labile in English. Before stand-alone sentences, COMP deletion is obligatory.

  1. *That Rome was not built in a day,

hangs in mid-air, a COMP deletion away from the trite thought that Rome was not built in a day. So does

  1. *Whether Stearasil starves pimples,

another uneasy mid-air survivor of a pending COMP deletion.

Before stand-aside sentences, on the other hand, COMP deletion goes either way. COMP deletion reduces

  1. I don’t think that Stearasil starves pimples,

to

  1. I don’t think Stearasil starves pimples.

But in still other sentences, COMP deletion goes bad at once.

  1. *Trottweiler is a lunatic was obvious from his speech,

is no good as anything more than an anomalous sputter; but

  1. That Trottweiler is a lunatic was obvious from his speech,

is a fine, manly objurgation.

On encountering some tedious privilege-checker,

  1. For you to keep checking your privilege is becoming tiresome,

is both satisfying as a rebuke and correct as a sentence; but not so

  1. *You to keep checking your privilege is becoming tiresome.

Simply restoring COMP to 31 and 34 returns them to sentential dignity.

Given the many occasions in which it might be useful to get rid of COMP, Chomsky and Lasnik needed to address the question whether to delete COMP constructions across the board, or to take on the job one COMP at a time. “The conditions under which such deletion rules could apply,” Henk van Riemsdijk remarked,

were, of course, originally stated in the structural descriptions of each individual deletion transformation. But here as well, a generalized theory was felt to be preferable. Ideally, such a theory would amount to the claim that there is one generalized deletion rule, “delete α,” which would be subject to a set of powerful constraints that would prevent massive overgeneration and ensure proper application in specific languages. Chomsky and Lasnik (1977) was an important step in that direction. That-deletion, for-deletion, and wh-deletion were abandoned and replaced by a rule of free deletion in COMP.

A principle of free COMP deletion, Chomsky and Lasnik decided, should be one of the rules of core grammar.

  1. In the domain COMP, delete (α φ), where α is an arbitrary category and φ an arbitrary structure.

Get rid of COMP ad libitum, as physicians say, often to their regret.

Linguists, too.

The Modern Conveniences

The rule of free COMP deletion has in its favor the fact that the alternative is worse. Without free COMP deletion, rules would require ordering. If there are two rules in the grammar’s core such that one must be applied before the other, there are, all at once, three rules, a nuisance for linguist and language learner alike—the original two rules, and the rule stating which one of them comes first. This sort of thing can quickly get out of hand, especially when imperative and reflexive constructions are mutually engaged. If wash yourself reflects the order in which the reflexive and imperative rules apply—you wash you going over to you wash yourself, and thereafter to wash yourself—then the other way around would have the imperative apply directly to you wash you, yielding *wash you! Once the subject is obliterated by the imperative, the reflexive rule lacks a correct structural description to which it can apply.
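A toy computation makes the point about ordering vivid. In the Python sketch below, ours and no part of any real grammar, two string-rewriting rules stand in for the reflexive and imperative transformations; applied in one order they yield wash yourself, in the other, *wash you.

```python
# A toy illustration, not a real grammar: two rewriting rules whose
# order of application decides between sense and gibberish.

def reflexive(clause):
    # The reflexive rule's structural description requires an overt,
    # repeated subject: "you wash you" -> "you wash yourself".
    words = clause.split()
    if len(words) == 3 and words[0] == "you" and words[2] == "you":
        words[2] = "yourself"
    return " ".join(words)

def imperative(clause):
    # The imperative rule deletes the second-person subject:
    # "you wash yourself" -> "wash yourself".
    words = clause.split()
    return " ".join(words[1:]) if words and words[0] == "you" else clause

base = "you wash you"

# Reflexive before imperative: the grammatical order.
print(imperative(reflexive(base)))  # wash yourself

# Imperative first: the subject is gone, so the reflexive rule no
# longer finds the structural description it needs.
print(reflexive(imperative(base)))  # wash you -- that is, *wash you!
```

Ordered application, in other words, is doing real grammatical work; eliminating the ordering statement means finding some other way to get the same result.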

How screw you! emerged from various competing grammatical claims is not well understood.

In Chomsky’s Master’s thesis on modern Hebrew morphophonemics, rule ordering went down to the twenty-fifth level. By the time that Chomsky and Lasnik came to write about filters, they realized that children learning English, having mastered the edifice of English grammatical rules, would also have had to master their proper order of application.

An orthodox Jew, it is often observed, has no time to be anything other than an orthodox Jew.

Native English speakers under a regime of rule ordering did not seem much better off.

Free COMP deletion thus has in its favor all of the modern conveniences. On the other side of this particular ledger, there is the fact that, just as Robert Lees suggested, English speakers can come up with a dozen challenges to free COMP deletion:

    1. It bothers me *(for) (Bill to win).
    2. It is illegal *(for) (Bill to take part).
    3. It is preferred *(for) (Bill to take part).
    4. I want very much *(for) (Bill to win).
    5. He argued passionately *(for) (Bill to be given a chance).
    6. There is someone at the door *(for) (you to play with).
    7. I received a book on Tuesday *(for) (you to read).
    8. *(For) (John to take the job) would be preferred.
    9. *(For) (John to be successful) would be unlikely.

In each of these examples, COMP deletion has overshot its intended mark. I want very much Bill to win is, for example, a mistake typically made by native Mandarin speakers learning English.

“We hope to preserve the very simple and general rule,” Chomsky and Lasnik wrote, “that elements in COMP may freely delete, as a rule of core grammar.”

No one could object to the hope.

Filters

Within the context of “Filters and Control,” it is the filters that are intended to choke off the gibberish that the drain of free COMP deletion would otherwise let through. The idea had its origin in David Perlmutter’s 1968 dissertation, “Deep and Surface Structure Constraints in Syntax.” Chomsky and Lasnik promoted filters to theoretical status. The filters, they argued,

will have to bear the burden of accounting for constraints which, in the earlier and far richer theory, were expressed in statements of ordering and obligatoriness, as well as contextual dependencies that cannot be formulated in the narrower framework of core grammar.

There were eleven filters in all. The (for-to) filter excluded

  1. *The Cardinal was planning for to go to Rome,

but left open the possibility that

  1. Bobby Joe was planning for to marry his sister,

might be an Ozark dialect. Other filters ruled out double COMP constructions in which two comped elements appear side by side:

  1. *Lothario is the man who that came,

a construction, as Dutch linguists promptly observed, that appears quite naturally in Dutch.

Constructions in which a lexical noun phrase finds itself directly attached to an infinitive—these were of special concern. Samson to bring down the house; or, more generally, (α NP to VP). Linguists had long known that the English infinitive is often in conflict with its ostensible subject.

  1. *It is unclear what the late Slobodan Milošević to do,

is a sentence that only the late Slobodan Milošević could have loved, even though only two letters separate 40 from the unoffending

  1. It is unclear what the late Slobodan Milošević is to do.

Given the conflicted concourse between noun phrases and their ancillary infinitives, the (α NP to VP) filter was intended for disciplined regulatory work:

  1. *(α NP to VP), unless α is adjacent to and in the domain of Verb or for.

Although permitted by the rule of free COMP deletion in core grammar, unwanted examples are flagged down later, when, like the rest of the core constructions, they come up to the surface. For all that, 42 has an undeserved air of terminological mystery, and at a first reading, it might seem that the asterisk marking unacceptability and the word “unless” are somehow in conflict, like two lifeguards determined to rescue one victim. The confusion is needless. The Chomsky–Lasnik filter resembles the declaration, seen often in old-fashioned burlesque houses and movie theaters, that no minors are allowed unless accompanied by an adult. A trip of three steps is involved.

  1. I want very much for Bill to win,

is sanctioned in core grammar by the grammar’s phrase structure rules.

  1. I want very much Bill to win,

is, in turn, justified by the rule of free COMP deletion. By the time that 44 makes it to the surface, ready either to be spoken out loud or handed over to the logical system, the Chomsky–Lasnik filter restores it to the common decencies of a grammatical sentence by blocking COMP deletion:

  1. I want very much for Bill to win.

On the other hand,

  1. John believes Mary to be brilliant,

and

  1. For Mary to be so brilliant, she had to work hard at it,

sail right through. “Mary” and the verb “believes” are side by side in 46; and “Mary” comes right after “for” in 47.

The argument has now acquired a distinctive four-part shape. The rules of the grammar’s core sanction indifferently any combination of a lexical noun phrase and an infinitive: (NP to VP). They sanction as well any embedding of (NP to VP) into a still larger COMP context: (COMP NP to VP). They permit, in the third place, the free deletion of COMP in this context—any COMP, any time. And, finally, filters are provided to handle the overflow into ungrammaticality; the representations that make it through the filters make it through them as grammatical sentences.
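The four-part shape invites a mechanical rendering. The Python sketch below is our own schematic, with invented token labels rather than genuine phrase markers: COMP deletes freely in the core, and a crude version of the (α NP to VP) filter then flags the overflow at the surface.

```python
# Our schematic, not Chomsky and Lasnik's formalism: sentences are flat
# token lists; "COMP:for" marks a complementizer, "V" a verb.

def free_comp_deletion(tokens):
    # Core grammar: elements in COMP may freely delete -- any COMP,
    # any time. Yield every variant, with and without each COMP.
    variants = [[]]
    for tok in tokens:
        extended = [v + [tok] for v in variants]
        if tok.startswith("COMP:"):
            extended += [list(v) for v in variants]  # the deleting variant
        variants = extended
    return variants

def np_to_vp_filter(tokens):
    # *(alpha NP to VP), unless alpha is adjacent to and in the domain
    # of Verb or "for".
    for i, tok in enumerate(tokens):
        if tok == "NP" and tokens[i + 1:i + 2] == ["to"]:
            if i == 0 or tokens[i - 1] not in ("V", "COMP:for"):
                return False  # flagged down at the surface
    return True

# "I want very much for Bill to win"
core = ["NP", "V", "Adv", "COMP:for", "NP", "to", "VP"]
surface = [v for v in free_comp_deletion(core) if np_to_vp_filter(v)]
# Only the variant that retains "for" survives the filter.
```

The core overgenerates on purpose; the filter, applied once at the surface, does all of the disciplinary work, which is just the division of labor the essay describes.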

Vergnaud’s Letter

No one would think to say that the EST was wonderfully elegant. It might even seem—not to us, of course—as if the system’s filters were an adventitious afterthought, something that grammarians added to the system to tie up a few loose ends. A superficial look at a modern internal combustion engine often conveys the same impression.

Well, what you got, you got lifter tick on account of the fact that you got a bent push rod. Which one of you figures he’s Juan Manuel Fangio, by the way?

Grammatical filters were destined to do useful work; and, like hydraulic lifter rods, they appeared ineluctable. A grammatical theory cannot easily do without the first; and the internal combustion engine could not easily do without the second. What gave pause in the EST was not what the theory contained, but what it lacked. The Chomsky–Lasnik (α NP to VP) filter served the ends of descriptive adequacy; it served those ends by abbreviating any number of anomalies of the sort in evidence at 36. There was no particular need to examine them on a case-by-case basis. The filter served to execute a full sweep. Bent on marriage, Bobby Joe remains where the (for-to) filter left him: midway between violating a rule of grammar and outraging a taboo. Double COMP constructions were left to the Dutch. What remained unaccommodated by the Chomsky–Lasnik filters was the demand for explanatory adequacy. What deep concept tied them together and served to show that the prohibitions that they enforced—no to I want very much Bill to win, no to the late Slobodan Milošević to do, no, in thunder, to Bobby Joe—were part of a unified system of prohibitions, things contrary to law malum per se?

It was this need that Vergnaud’s letter met. His letter had some of the effect commonly assigned to heat lightning. It lit up the scene. There should be a name for events of this sort. Perhaps the German Gedankenblitz will do.

In 1956, Francis Crick predicted the existence of transfer RNA by what amounted to a transcendental deduction. Given the chemical discrepancy between the nucleic acids and the proteins, something must mediate between them. He was entirely correct. He had been guided to this conclusion by nothing more than his uncanny intuition.

Gedankenblitz.

In the same year, Kurt Gödel wrote a now-famous letter to John von Neumann, in which, after sensitively wishing the stricken von Neumann an improvement in his health, Gödel posed the question of whether P = NP.

Gedankenblitz again.

Vergnaud’s recovery of case belongs in this distinguished class. Whether case is morphologically specified, as in Latin, or abstract, as in English—Case Majeure, as linguists say—is a matter on the surface of things, where language is largely froth. Deep down, case is compelling because linguistics has become a part of the Galilean undertaking, a way of explaining what is visible by an appeal to what is not. Chomsky and Lasnik knew that this would become so; but in Vergnaud’s letter, they could see that it was becoming so.

This is no small thing.

Cold Case

The boy loves the girl. Puer puellam amat. But equally Puellam puer amat, which is again the boy loves the girl. Latin nouns and pronouns are all entombed in the closed coffin of their case. There are seven cases in the singular: nominative, accusative, dative, genitive, ablative, vocative, and locative. They are all morphologically marked, something that Latin-speaking children once picked up with ease, and that later little Latinists picked up only by memorizing those grim endings, one after the other. The Latin plural requires seven additional endings. Beyond a few case-like relics—who, whom—English is not inflected for case.

Yet Jean-Roger Vergnaud, in his letter to Chomsky and Lasnik, saw a way of simplifying their system on the assumption that English, too, had a form of case. “Here’s what I have in mind,” he began. “I believe that this filter (the (α NP to VP) filter) could be replaced by a filter that governs the distribution of certain NP’s.” To accommodate this idea, Vergnaud proposed that the English language, against all appearances to the contrary, possessed a three-part case structure:

The subject case is the case of subjects in tensed clauses. The sentence that the subject case is the case of subjects in tensed clauses illustrates itself.

The genitive case is the case of Mary’s book, hers, yours, mine, and the honorary genitive, etc.

The governed case is the case of verbal and prepositional complements, as in Mary saw him, Mary gave him a book, Mary talked to him, a book by him.

If three English cases are now on the mortician’s table, they are certainly not anywhere much in evidence in what a native English speaker might say. “Case inflectional morphology,” Vergnaud cheerfully admitted, “is quite poor, of course.” Cases do linger in the English system of pronouns: I, me, mine, you, yours, and all the rest of the standalones, shut-ins, and stand-ins, but in comparison to the fantastic abundance of Latin inflections, the English pronouns represent only the shrunk shank of a morphological system that was case-heavy more than eight hundred years ago.

The English cases to which Vergnaud appealed are theoretical entities; and since they are invisible, they must be inferred. Physicists understand inferences of this sort at once. Why else would they talk of spin with respect to entities that do not spin and are not entities? Or countenance those exquisite Faddeev–Popov ghosts that flit into existence and then flit out again—as do we all?

“A characteristic property of infinitival constructions,” Vergnaud argued, “is that, in such constructions, the subject is in the Governed Case.” This is a fine insight, and not one that Latin linguists would have made. They were persuaded that the nominative must betoken the subject. Not so. A noun phrase can be displaced in Latin, as in Caesar occiditur, where “Caesar” is in the nominative despite the fact that it is the logical object of the verb. He was murdered, after all.

English has something similar. Witness Vergnaud’s examples:

    1. We’d prefer for him to leave.
    2. It is illegal for him to leave.
    3. We found a man for him to speak to.
    4. For him to leave would be unfortunate.

All of them were brilliantly chosen because, in the he-him distinction, English retains an ancient morphological case marker. The noun phrase is in the governed, or even the accusative case, but the noun phrase carrying the case is serving as the subject of the infinitive that follows.

Assumptions now begin to multiply, but with exhilarating force. “Well, I shall hypothesize,” Vergnaud wrote, “that the distribution of infinitival constructions of the form NP to VP follows from the distribution of NPs in the Governed Case.” An otherwise invisible case is now given control of an otherwise problematic construction.

“Specifically, let’s posit,” Vergnaud argued, that

  1. A structure of the form …(α…NP…)…, where NP is in the Governed Case and α is the first branching node above NP, is ungrammatical unless (i) α is in the domain of (–N) or (ii) α is adjacent to and in the domain of (–N).

Two years later, Chomsky expressed 49 as a principle:

  1. *NP if NP has phonetic content and has no Case.

There are three points that are not immediately obvious in 49 and 50.

The first: that (–N) may have attributes of a verb or a preposition, and either may sanction case in its domain. In its negative incarnation, (–N) functions as a case assigner; made positive as a noun or adjective, as a case receiver.

The second: that whatever the noun phrase, if it is speakable then it must have case.

The third: that infinitives must have a subject; and if no speakable subject may be found, then the otherwise silent PRO must go where lexical NPs dare not tread. The grammatically impeccable

  1. Susan tried to solve the problem,

represents the phonetic residue in real life of

  1. Susan tried (PRO to solve the problem),

and not anything like

  1. *Susan tried (John to solve the problem).

John is clueless in 53 because tried has left him caseless. Without a case-marked noun phrase, the infinitive has no subject, and simply hangs in space—whence the demand that the case filter encompass noun phrases with a voice of their own.
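Chomsky’s formulation lends itself to a small sketch. The Python below is our simplification, not a parser; the token labels, the bracket standing for a non-finite clause boundary, and the set of case assigners are all invented for illustration.

```python
# A toy rendering of the Case Filter, our simplification:
# *NP if NP has phonetic content and has no Case.
# "NP:John" is an overt noun phrase; "NP:PRO" is silent; "[" opens a
# non-finite clause, an adjacency-blocking boundary.

CASE_ASSIGNERS = {"V", "P", "COMP:for", "Tense"}  # the (-N) assigners, plus tense

def case_filter(tokens):
    for i, tok in enumerate(tokens):
        if not tok.startswith("NP:"):
            continue
        phonetic = tok != "NP:PRO"  # PRO has no phonetic content
        neighbors = tokens[max(i - 1, 0):i] + tokens[i + 1:i + 2]
        has_case = any(n in CASE_ASSIGNERS for n in neighbors)
        if phonetic and not has_case:
            return False  # an overt, caseless NP: ungrammatical
    return True

# Susan tried (PRO to solve the problem): PRO is caseless but silent.
print(case_filter(["NP:Susan", "Tense", "V", "[", "NP:PRO", "to", "V", "NP:problem"]))   # True

# *Susan tried (John to solve the problem): John is overt and caseless.
print(case_filter(["NP:Susan", "Tense", "V", "[", "NP:John", "to", "V", "NP:problem"]))  # False
```

The believe-class, which assigns case across the non-finite boundary, would need an exception to this adjacency check; the sketch models only verbs of the decide and try class.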

The burden of Vergnaud’s elegant argument is that case is obligatory, even in English. If English cases are not directly reflected in their morphology, as plainly they are not, their assignment is surely not arbitrary. The accusative case assigned to Caesar in Caesar to cross the Rubicon is determined either by some antecedent verb or by some locally loitering preposition. “That is to say,” as Jonathan David Bobaljik and Susi Wurmbrand observe, “verbs and prepositions have the distinctive characteristic of being (accusative) case assigners, and thus the disjunctive environment stipulated in Vergnaud’s ‘unless’ clause is none other than the domain of accusative case assignment.”

While verbs and prepositions perform this function, how they perform it is rather less clear. Some barely sensed prohibition of action at a distance is at work throughout. Case assignment is a local operation, one constituent influencing another more or less directly. The world would have been a relatively simpler place if elements in (–N) assigned case only to sisterly constituents. This does happen: there is cross him at your peril or go easy on him. But it is easy to see that a comped preposition is generally not the sisterly constituent of the subject to which it assigns case. In

  1. For (John to solve the problem), …

the COMP for is a sister to the entire sentence.

If case is not assigned under simple sisterhood, or at an arbitrary distance, what are the structural conditions under which it is applied? This is no easy question. Delilah may have relieved Samson of his hair along with his manhood, but there is no grammatical way in which

  1. *Delilah decided Samson to bring down the house.

The verb decide cannot cross the non-finite clause (Samson to bring down the house) to assign Samson any case at all.

But

  1. Delilah believed Samson to be better off bald,

makes perfect grammatical sense, evidence that while “decide” remains stuck at non-finite frontiers, “believes” crosses over them with no passport at all.

“The Case Filter provides an account of this contrast,” Bobaljik and Wurmbrand remark, “if (emphasis added) what is special about the believe class is that they permit Case assignment across a non-finite clause boundary.”

Little if, big if, native Mohawk speakers say.

Case in Point

Vergnaud’s case filter represented an impressive achievement in unification, bringing Chomsky and Lasnik’s various and vagrant filters under the umbrella of a single governing concept. No matter the dialect current in the Ozarks, (for-to) constructions are stricken from the official record by case considerations. The anomalies at 36, once handled by the Chomsky–Lasnik (NP to VP) filter, are now handled redemptively by Vergnaud’s case filter. After this has been seen, it is easy enough to see.

Case enters into linguistics as both a parameter and a principle. The case parameter is by default off. There is nothing to notice in Chinese. Off is off. Heard in the squawking atmosphere of childhood, Finnish is different. There is something to notice. On goes on. One expects Finnish toddlers, not having yet mastered the case morphology of their native language, to start practicing relatively fixed phrasal orders.

They do not have a minute to lose.

English toddlers, by way of a contrast in case, can take it easy. They are at their ease. English is not for this reason easier than Finnish. It is different. What it lacks in case morphology, it gains in a rich and complex prepositional system: put up, put in, put down, put out, put by, put with, put aside, so many inscrutable variants of putt putt to the put-upon, iron-eared, and tongue-tied Finnish student of English.

Abstract Case lies beyond the reach of morphology, and unlike common case, appears upswept by a capital letter. It is neither On nor Off. Chinese was long thought case-less, even by Chinese linguists. This is true only to the extent that Chinese fails to reflect its cases in its morphology, one reason that classical Chinese poetry is so very difficult to translate into English. It was Audrey Li who demonstrated the persistence of Case in Chinese. Before the publication of Abstract Case in Mandarin Chinese, few linguists believed that case figured in Chinese at all; after its publication, the burden of proof shifted. If case is a parameter in practice, Case is a principle in theory. Noun phrases require Case in all languages. This, Vergnaud observed, is no very ordinary injunction. It is not something that just happens to be true. It is a principle of Universal Grammar. It is of the essence. Vergnaud knew perfectly well that Greek and Latin were case-heavy, French and English, case-lite. He had nevertheless argued in defiance of the facts to a profound universal conclusion: Case is a feature of language itself. In this, he had little more than intuition and taste to guide him.

His reasoning was exquisite because, in so many respects, it was not reasoning at all.

Case and case have now become entrenched within modern linguistic theory, so much so that various displacement operations could not be stated without them. The construction of the passive voice is a case in point. When the Bible reports that Cain killed Abel, it places Cain in the nominative and Abel in the accusative case. If Cain killed Abel, then obviously, Abel was killed by Cain. With this analysis, traditional grammarians may be observed leaving the room well-satisfied. Modern grammarians are otherwise. No one has figured out a way in which to say goodbye to them. The introduction of the past participle, with its distinctive morphology, triggers a series of subtle grammatical twitches in core grammar. The past participle killed requires a noun phrase on its right. That Cain Abel killed is not English. Abel must stand where he was, in English, and, alas, in life. The first twitch leading to the passive occurs when killed is assigned both its object in Abel and some undetermined subject:

  1. Something was killed Abel,

or

  1. e was killed Abel,

where e designates an empty category.

Under ordinary circumstances, verbs have the power to assign case. In

  1. Cain killed Abel,

the verb killed assigns to Abel grammatical standing in the morphologically unmarked accusative case.

But, in 58, was killed has lost its animating powers of case assignment. This is the decisive feature of the passive voice. If Abel cannot be assigned case, the Case Filter rules the sentence ungrammatical. Questions of case now lead to a dilemma: either the passive voice is ungrammatical or case considerations must force the errant noun phrase Abel to scoot over the sentence to the only slot in which it can receive case at all. That is the slot marked by the empty category e.

Whereupon there is

  1. Abel was killed.

This analysis has remained virtually unchanged within the minimalist program, where case plays a central role in terms of activation conditions for transformations: a noun phrase whose case is checked via an agreement process (obvious in many Romance languages) becomes inaccessible to further transformations. Case comes to the fore within the theory of transformations in a way that Vergnaud did not anticipate and could not have seen. No one thinking about the passive voice would very easily conjecture that the necessary movements by which it is made possible have anything obvious to do with case.

Yet it is so.
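The derivation can be made mechanical. The Python sketch below is our own schematic, not the minimalist machinery: the passive participle assigns no case, so the stranded noun phrase moves into the empty subject slot, the one position where case is available.

```python
# Our schematic of case-driven movement, not a real syntactic engine:
# "e" is the empty category; the passive participle assigns no case,
# so the caseless NP must move into the slot governed by tense.

def passivize(d_structure):
    s = list(d_structure)
    if "e" not in s:
        return s  # nothing to move into; structure unchanged
    e_index = s.index("e")
    np_index = next(i for i, t in enumerate(s) if t.startswith("NP:"))
    s[e_index] = s.pop(np_index)  # Abel scoots over the sentence
    return s

print(passivize(["e", "Tense", "was", "killed", "NP:Abel"]))
# ['NP:Abel', 'Tense', 'was', 'killed']
```

The movement is not stipulated for its own sake; it is forced, exactly as the dilemma above forces it, by the demand that every overt noun phrase receive case.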

Progress in linguistics should not be assigned the aspect of an intellectual deliverance. Beyond the expanding circumference of light, there remains the enveloping area of darkness. Vergnaud understood that two fundamental issues of language remain unexplained: how children acquire languages rapidly and without effort, and how human beings acquired the species-specific characteristic of their race. In their difficulty, these problems are second only to questions about the origins of life. Some progress has been made in explaining how children do what so plainly they do. They acquire their native language in virtue of their universal inheritance. When Chomsky first advanced this idea in the 1960s, philosophers responded ambidextrously that the idea was absurd and that they had known it all along. It now seems inevitable. There must be an innate difference between a human being and a dog. One never stops talking, the other never begins. But if human beings are notable in their possession of Universal Grammar, what explains the acquisition of Universal Grammar by the human species?

This is a far more difficult problem.

Jean-Roger Vergnaud’s letter now belongs to the perpetual inventory of remembered things. The master dies with the matter. Linguistic theory has changed profoundly in forty years. Vergnaud’s letter remains a deeply moving document, the expression of his desire to see beneath the infernal arbitrariness of description to the place where unity prevails.