A physicist tackles the evolution of word order

Murray Gell-Mann was awarded the 1969 Nobel Prize in Physics for explaining the diversity of baryons and mesons in terms of more rudimentary particles, which he named quarks. If you enter his name into Google Scholar, the citation-based search engine duly returns his classic papers. In the top slot is the 1964 Physics Letters paper that earned him the Nobel.

Gell-Mann’s latest paper is about neither quarks nor complexity, the latter being the focus of his research for the past two decades. It’s not even about physics. In the 18 October issue of the Proceedings of the National Academy of Sciences (PNAS), you’ll find a paper by Gell-Mann and Stanford University linguist Merritt Ruhlen entitled “The origin and evolution of word order.”

By word order, Gell-Mann and Ruhlen refer to the basic sequence of subject (S), verb (V), and object (O). They classify English as SVO, as demonstrated by their example, “The man (S) killed (V) the bear (O).” Other word orders are found among the world’s 2000 or so languages. Welsh, the first language I studied in school, is VSO. The sentence about bear-killing reads “Fe laddodd (V) y dyn (S) yr arth (O).” Japanese, the last language I studied, is SOV, as in “男は (S) 熊を (O) 殺した (V).”
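
As a minimal sketch of that classification (not Gell-Mann and Ruhlen’s own procedure), a sentence whose parts have already been labeled by hand can be reduced to its word-order type by reading the labels in sequence. The function name `word_order` and the hand-tagged pairs are assumptions made for illustration.

```python
def word_order(tagged_sentence):
    """Return the word-order type (e.g. "SVO") of a hand-tagged sentence.

    tagged_sentence is a list of (phrase, role) pairs, where role is
    "S" (subject), "V" (verb), or "O" (object).
    """
    order = "".join(role for _, role in tagged_sentence)
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError("expected exactly one S, one V, and one O")
    return order


english = [("The man", "S"), ("killed", "V"), ("the bear", "O")]
welsh = [("Fe laddodd", "V"), ("y dyn", "S"), ("yr arth", "O")]
japanese = [("男は", "S"), ("熊を", "O"), ("殺した", "V")]

for name, sentence in [("English", english), ("Welsh", welsh), ("Japanese", japanese)]:
    print(name, word_order(sentence))
```

The hard part in practice is the tagging, which the sketch takes as given.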

Gell-Mann and Ruhlen start their fascinating paper by noting that three lines of evidence—from genetics, archaeology, and linguistics—all indicate that humans suddenly started using sophisticated tools and making objects of art around 50 000 years ago. “The cause of this abrupt change has been attributed to the appearance of fully modern human language,” they write in their introduction, “and this is a plausible conjecture.”

The sudden arrival of modern language suggests a single origin, a linguistic Big Bang. Under that assumption, Gell-Mann and Ruhlen sought to identify the word order of that ancestral modern language. To find it, they looked at the word order of 2011 languages and at those languages’ likely family trees. Their conclusion: The first modern language was SOV, and languages in general evolve in the order SOV → SVO → VSO. Welsh, it seems, is more evolved than English.

Gell-Mann isn’t the first eminent physicist to study linguistics. Thomas Young (1773–1829) proved the wave nature of light, founded physiological optics, and elucidated elasticity and capillarity. He also helped to decipher the ancient Egyptian hieroglyphs on the Rosetta Stone.

What do physics and linguistics have in common that attracted Young, Gell-Mann, and perhaps others? My hunch is the quest to find order. The list of 2011 languages that Gell-Mann and Ruhlen included in their study is available on the PNAS website. The list is 45 pages long! Making sense of that diversity is a challenging and worthwhile goal.

Charles Day

Hydra, fruit flies, and stripy colonies of bacteria

In 1952, two years before his untimely death at the age of 41, the mathematician Alan Turing wrote an influential paper entitled “The Chemical Basis of Morphogenesis.” The paper tackled the problem of how limbs and other structural patterns arise in plants and animals that begin life as undifferentiated blobs of cells.

Turing’s mechanism relies on the competition between a slow-diffusing chemical—a morphogen—that activates a reaction and a fast-diffusing chemical that inhibits the reaction. Nudging the reaction-diffusion system into a metastable state yields stable stripes, spots, and other patterns.
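
The role of the two diffusion rates can be checked with a few lines of linear algebra. The sketch below is an illustration under stated assumptions, not Turing’s own equations: it linearizes a generic activator-inhibitor pair around a uniform state, and the reaction coefficients `a`, `b`, `c`, `d` and the diffusion constants are invented values chosen so that the well-mixed system is stable.

```python
import math


def max_growth_rate(q, a, b, c, d, diff_u, diff_v):
    """Largest real part of the growth rates at wavenumber q.

    Linearized activator (u) / inhibitor (v) model:
        du/dt = a*u + b*v + diff_u * laplacian(u)
        dv/dt = c*u + d*v + diff_v * laplacian(v)
    A perturbation ~ exp(i*q*x) replaces each laplacian with -q**2.
    """
    m11 = a - diff_u * q**2
    m22 = d - diff_v * q**2
    trace = m11 + m22
    det = m11 * m22 - b * c
    disc = trace**2 / 4 - det
    if disc >= 0:
        return trace / 2 + math.sqrt(disc)
    return trace / 2  # complex pair: the real part is trace / 2


def forms_pattern(diff_u, diff_v, a=1.0, b=-1.0, c=2.0, d=-1.5):
    """True if the uniform state is stable at q = 0 but unstable at some q > 0."""
    stable_when_mixed = max_growth_rate(0.0, a, b, c, d, diff_u, diff_v) < 0
    wavenumbers = [0.05 * i for i in range(1, 101)]
    grows = any(max_growth_rate(q, a, b, c, d, diff_u, diff_v) > 0 for q in wavenumbers)
    return stable_when_mixed and grows


print(forms_pattern(diff_u=1.0, diff_v=20.0))  # inhibitor diffuses 20 times faster
print(forms_pattern(diff_u=1.0, diff_v=1.0))   # equal diffusion rates
```

With these illustrative numbers the first call reports a pattern-forming instability and the second does not: the same chemistry produces structure only when the inhibitor outruns the activator.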

Judging by his paper’s abstract, Turing was inspired, in part, by Hydra, a genus of simple, water-dwelling animals whose body plan consists of a single sticky foot, a stem, and 1–12 thin, neurotoxin-charged tentacles. Although his mechanism presumes a continuous, two-dimensional system, its basic premise—that pattern development is controlled by the concentration-dependent diffusion and inhibition of signaling molecules—is observed in three-dimensional, multicellular systems, notably in biologists’ favorite fly, Drosophila melanogaster.

In common with other signaling molecules, morphogens initiate a complex chain of biochemical steps. For a flavor of that complexity, here’s how the University of Tokyo’s Tetsuya Tabata and Yuki Takei described the action of one Drosophila morphogen, Dpp, in a primer published in 2004:

The pathway that transduces the Dpp signal involves a combination of two types of serine/threonine kinase receptors, type I and type II. The activated type I receptor phosphorylates cytoplasmic transducers, so-called receptor-regulated Smads (named after the first-identified members of this family: Sma in C. elegans and Mad in Drosophila), which, upon phosphorylation, translocate into the nucleus and regulate the expression of target genes (Fig. 4A). In Drosophila wing development, Thickveins (Tkv) acts as a type I receptor; its constitutively active form (Tkv*), when ectopically expressed, can induce the expression of the target genes sal and omb (Fig. 4).

Each step in the Dpp pathway provides an opportunity for regulation, thereby helping to ensure that a larval fruit fly stays on course to develop properly functioning wings. Given the high biological stakes—a fly with malformed wings can’t feed or breed—the complexity of the Dpp pathway is understandable and evolutionarily inevitable.

A paper published today in Science is noteworthy because it demonstrates a simpler, albeit artificial, route for pattern formation in a multicellular system. Jian-Dong Huang of the University of Hong Kong, Terence Hwa of the University of California, San Diego, and their collaborators genetically modified Escherichia coli so that the single-celled organism would lose its mobility when crowded with other cells from the same mutant strain. Left to proliferate at the center of a nutrient-rich dish, colonies of the mutant spontaneously formed stable, concentric stripes of alternating high and low density.

The genetic engineering that underlay the pattern formation involved three basic steps:

  • Appropriating the density-sensing gene from another bacterium, Vibrio fischeri. Once equipped with the gene, the mutant E. coli bacteria made and excreted a small molecule called AHL when they crossed a density threshold.
  • Controlling E. coli’s mobility. Usually, when an E. coli bacterium senses a gradient in the concentration of a nutrient, it swims up the gradient. When it can’t sense a gradient, it stops, momentarily tumbles, then swims off in a different, random direction. Knocking out a gene called cheZ deprives E. coli of its ability to swim. The Huang–Hwa team modified the genome of their E. coli so that cheZ would be suppressed in the presence of a molecule called CI.
  • Linking density-sensing to mobility. Another modification caused CI to be synthesized in the presence of AHL, the molecule secreted when the mutant E. coli is crowded.
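
Taken together, the three steps form a simple chain of cause and effect. The sketch below writes that chain as on/off logic. It is a cartoon rather than the team’s model: the function name `can_swim` and the threshold value are hypothetical, and the real circuit responds gradually, not as a switch.

```python
DENSITY_THRESHOLD = 1.0  # hypothetical units; the article gives no value


def can_swim(local_density, threshold=DENSITY_THRESHOLD):
    """Follow the engineered chain: crowding -> AHL -> CI -> cheZ off -> no swimming."""
    ahl_present = local_density > threshold  # density sensing (gene from V. fischeri)
    ci_present = ahl_present                 # AHL causes CI to be synthesized
    chez_expressed = not ci_present          # CI suppresses cheZ
    return chez_expressed                    # without cheZ the cell cannot swim


for density in (0.2, 0.9, 1.5):
    print(density, "mobile" if can_swim(density) else "immobile")
```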

The stripes result from the bacteria’s density-dependent mobility. As a colony starts consuming nutrient and proliferating, the density of bacteria in the center rises and a front of low-density bacteria expands from the center into fresh nutrient. At the center, the proliferating bacteria cross the density threshold at which AHL, through the synthesis of CI and the suppression of cheZ, deprives the bacteria of their swimming ability. Those central bacteria still eat and proliferate. As they do so, their density continues to rise until the nutrient is exhausted. The upshot is a stable circular patch of high bacterial density.

The bacteria just beyond the central patch are still mobile. Some of them move inward and become trapped; others move outward, remain mobile, and create a second high-density region behind the still-expanding front. As before, when the nutrient runs out, a stable patch of high bacterial density is left behind, this time in the shape of a ring. Between the two high-density regions lies a stable low-density region that marks the zone where inward- and outward-moving bacteria met different fates. The creation of ring-shaped stripes continues until the front reaches the edge of the dish and the nutrient runs out.

By formulating a mathematical model of stripe formation, the Huang–Hwa team could predict the conditions under which stripes form and whether they form at all. And by adding an extra genetic modification, one that allows the suppression of cheZ to be tuned, the researchers could test their model. It passed.

The Huang–Hwa team closes its paper by noting that the formation of stripy colonies of bacteria suggests that periodic structures can form autonomously in individual organisms without the intervention of a biological clock. To me, the stripy colonies suggest something else: that the complex pattern-formation mechanisms in Drosophila and other higher organisms evolved from something akin to the genetically engineered mechanism in the bacteria.

Charles Day

The ten thousand Kims

When Koreans marry, the wife retains her name, which is entered into the husband’s copy of his family genealogy (jokbo or chokpo in Korean). The practice, which reflects the Confucian reverence for one’s ancestors, has continued for centuries. As you might expect, jokbos are of great interest to historians. Less obviously, they provide a means for three physicists to test their statistical theories.

The physicists are Seung Ki Baek and Petter Minnhagen of Umeå University in Sweden and Beom Jun Kim of Sungkyunkwan University in South Korea. In a paper posted on the arXiv e-print server, they describe their analysis of women’s names recorded in 10 jokbos that go back 480 years.

Baek, Minnhagen, and Kim divided the jokbos’ 480-year span into 30-year intervals and for each interval tallied three quantities: M, the number of women who joined the 10 families; N, the number of different family names that those women possessed; and kmax, the number of women who possessed the most common family name.
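
For a single interval, the three tallies amount to counting names. Here is a minimal sketch, assuming the names for that interval are already available as a list; the function name `tally` and the sample list are invented for illustration and are not data from the jokbos.

```python
from collections import Counter


def tally(family_names):
    """Return (M, N, k_max) for one time interval."""
    counts = Counter(family_names)
    m = len(family_names)                          # M: women recorded in the interval
    n = len(counts)                                # N: different family names
    k_max = max(counts.values()) if counts else 0  # k_max: size of the largest group
    return m, n, k_max


# Invented sample, not data from the jokbos.
sample = ["Kim", "Lee", "Kim", "Park", "Kim", "Lee", "Choi"]
print(tally(sample))  # (7, 4, 3)
```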

[Photo: the name “Kim” in Hangul]

The physicists wondered whether the changing values of M, N, and kmax could be reproduced by the random group formation (RGF) model. As a starting point, the RGF model assumes that groups (in this case, the N groups of women who share a family name) form through a mixing process that maximizes the entropy of a probability distribution (in this case, the probability P_M(k) that a randomly chosen woman from a population of M has a family name that occurs k times).

The number and size of groups predicted by the RGF model depend on the sample size, which is what you’d expect for family names in real life. As more generations are recorded in the jokbos, the number and frequency of different family names increase. What makes the RGF model distinctive is its history independence: For any generation, the frequency distribution of family names retains the same dependence on sample size.

That history independence might seem implausible, given how much famines, wars, industrial revolutions, and other traumas transform societies. To get the idea across, Baek, Minnhagen, and Kim make a comparison with the frequency distribution of words used by an author throughout his or her oeuvre. Because of its length and breadth, Leo Tolstoy’s 1440-page novel War and Peace has a different word-frequency distribution than does his 76-page novella The Death of Ivan Ilyich. Nevertheless, you can think of the two distributions as being drawn from the same single and very large “meta-book” that characterizes the novelist’s choice and use of words. Likewise Korean family names in the jokbos are drawn from the same “meta-registry” that reflects Korea’s enduring culture—provided the RGF model applies, that is.
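
The sample-size effect behind the meta-book analogy is easy to see numerically. The sketch below is not the RGF model: it only draws a short record and a long record from one fixed, hypothetical “meta-registry” and counts the different names in each. The 1000 placeholder names and their 1/rank weights are assumptions chosen for illustration.

```python
import random


def distinct_names(sample):
    """Number of different names (N) in a sample."""
    return len(set(sample))


# Hypothetical "meta-registry": 1000 names whose weights fall off as 1/rank.
names = [f"name{rank}" for rank in range(1, 1001)]
weights = [1.0 / rank for rank in range(1, 1001)]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = rng.choices(names, weights=weights, k=10_000)

small = draws[:100]  # a short record
large = draws        # a long record that contains the short one

print(len(small), distinct_names(small))
print(len(large), distinct_names(large))
```

Both records come from the same underlying distribution, yet they differ in how many names they contain simply because one is longer than the other.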

In fact, it turns out that the RGF model does reproduce how N has varied with M and other patterns derived from the jokbos. What is the origin of the model’s success? Baek, Minnhagen, and Kim speculate that the answer lies in the stability of Korean culture:

It seems that some core of the Korean culture has remained intact over at least 1500 years and as both the population and occupied area expanded, it basically swallowed other cultural influences without compromising its core.

One of the RGF model’s predictions is that kmax, the number of women who have the most frequently occurring family name, is proportional to M, the sample size (not the case, according to the RGF model, for other family names). Kim is the most common name in the jokbos and, indeed, in Korea. (“Kim” is the name that appears in the accompanying photo.) By applying the RGF model, Baek, Minnhagen, and Kim estimate that in AD 500 Korea was home to 10 000 Kims.

Charles Day

Wine and climate change

I’ve just returned from a two-week vacation in England and Wales. The picture shows one of the first places that my wife and I visited: Stokesay Castle, which, according to its official website, “is quite simply the finest and best preserved fortified medieval manor house in England.”

[Photo: Stokesay Castle]

I took the picture using my mobile phone. You can probably tell what that late September day was like: cool and damp. By the end of the vacation, the weather over England and Wales had changed significantly—for the better. The last full day, Saturday, 1 October, was the hottest October day ever recorded in Britain. In Gravesend, a town on the Thames Estuary, the temperature reached 29.9 °C (85.8 °F).

The unusually warm weather brought to mind a news story I’d read recently on the BBC’s website. Writing for the website’s Food and Drink department, Suemedha Sood reported on evidence that climate change is affecting the world’s wine-growing regions.

Winemakers grow grape varieties that suit the local soil and climate, what the French call the terroir. Through centuries of trial and error, winemakers in Italy’s Piedmont region, for example, have discovered where best to cultivate Nebbiolo grapes for Barolo wine, one of my favorites. In making their decisions about what to plant and where, winemakers in the New World make use of both Europe’s accumulated knowledge and the research conducted by enologists and viniculturalists.

But however you match grape to terroir, the best wines come from narrowly defined regions. And when the climate changes, those regions change too. In her BBC story, Sood mentioned research that warned of a shift in the location and size of America’s premium wine-producing regions.

Climate change has already affected winemaking in Britain. When William Shakespeare wrote Henry V around 1599, Europe had already spent 50 years in a period of lower-than-normal temperatures known as the Little Ice Age. To Shakespeare, the possibility that grapes might thrive in English soil must have seemed remote. One of the characters in Henry V, the French constable, contrasts the English and the French by comparing the climate and habitual beverages of their native lands:

Dieu de batailles! where have they this mettle?
Is not their climate foggy, raw and dull,
On whom, as in despite, the sun looks pale,
Killing their fruit with frowns? Can sodden water,
A drench for sur-rein’d jades, their barley-broth,
Decoct their cold blood to such valiant heat?
And shall our quick blood, spirited with wine,
Seem frosty? O, for honour of our land,
Let us not hang like roping icicles
Upon our houses’ thatch, whiles a more frosty people
Sweat drops of gallant youth in our rich fields!
Poor we may call them in their native lords.

Whether climate change will continue to benefit winemaking in Britain isn’t clear. If its only effect were higher temperatures, the answer would likely be yes. Britain lies just within the northern limit of viniculture in Europe. Higher temperatures would extend both the growing area of wine-worthy grapes and their variety.

But climate change could also bring more rain. A climate foggy and dull, even one that’s warm not raw, is not good for fine wine.

Charles Day