It's an old stereotype: he who hates mathematics curls up with a book, and she who revels in numbers is bored by fiction. But Franco Moretti, an English professor at Stanford University, believes that a full understanding of literature requires mathematical tools. He is inventing a new school of literary history based on statistical analysis of data about novels rather than close readings of the texts themselves.
"When we study literature, we really study a tiny, tiny portion of the literature that was actually published—around one percent," Moretti says. To understand literary trends as a whole, he asserts, "Close reading won't help: even if we read a novel a day every day of the year, it would take a century to read all the novels published in Britain in the 19th century."
For a wider understanding, Moretti believes, we need to approach literature as a science by applying quantitative methods that are widely used in other fields. "Potentially, it could redraw the whole map of literature," he says.
Moretti has uncovered some surprises. Traditionally, professors have taught that the novel rose during a single period in history, moving smoothly from obscurity to prominence. But Moretti charted the number of novels published in Britain between 1720 and 1850, and he found three distinct upward surges, each followed by a relatively stable period.
The first surge happened around 1720, when a new novel began to appear almost weekly in Britain instead of once a month or so. Moretti says this frequency allowed novels to become a regular part of British people's lives. A second surge that began around 1780 resulted in the publication of more novels than a single person could keep up with, encouraging readers to choose mostly new books rather than revisiting old ones. Moretti notes a concomitant collapse in sales of older books. A third surge, beginning around 1820, led readers to specialize in genres for the first time. Moretti published his findings in his 2005 book, Graphs, Maps, Trees.
Genres themselves also rose and fell in distinctive patterns, Moretti found. A new set of genres tended to rise about every thirty years and last for around twenty-five years, about the length of one generation of readers. But he is at a loss to explain this pattern.
Moretti sees an analogy between the evolution of genres and the evolution of species. He notes in his book that whatever causes the appearance and disappearance of genres must be "like a sudden, total change of their ecosystem. Books survive if they are read and disappear if they aren't: and when an entire generic system vanishes at once, the likeliest explanation is that its readers vanished at once." That still leaves the puzzle, though, of why literature's "ecosystem" would change every thirty or so years.
Currently, Moretti is analyzing the evolution of titles of British novels from 1740 to 1850. He presented his work on June 4 at the Digital Humanities 2007 conference in Urbana-Champaign, Illinois.
In the mid-1700s, novels typically had very long names. The full title of "Robinson Crusoe," for example, ran 68 words. As the market for novels became more competitive, titles became a marketing tool, so they needed to become pithier and more memorable. "At first this looks like a constraint," Moretti says. "But once writers are forced to write shorter titles, a whole new range of possibilities comes about. The basic issue is: how can just a few words suggest a lot of things?"
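The kind of measurement behind a title study like this can be sketched in a few lines of Python. This is only a minimal illustration, not Moretti's actual method or data: the function name `mean_title_length_by_decade` and the placeholder records are assumptions invented for the example, and a title's length is taken simply as its number of whitespace-separated words.

```python
from collections import defaultdict


def mean_title_length_by_decade(records):
    """Return {decade: mean title length in words} for (year, title) pairs.

    Hypothetical helper for illustration only. Word count is a plain
    whitespace split, which is a rough proxy for title length.
    """
    lengths = defaultdict(list)
    for year, title in records:
        decade = (year // 10) * 10  # e.g. 1748 -> 1740
        lengths[decade].append(len(title.split()))
    # Average the word counts collected for each decade, in date order.
    return {decade: sum(v) / len(v) for decade, v in sorted(lengths.items())}


# Placeholder records invented for the example (not real bibliographic data).
sample = [
    (1741, "one two three four five six"),
    (1748, "one two three four"),
    (1843, "one two"),
]
print(mean_title_length_by_decade(sample))  # {1740: 5.0, 1840: 2.0}
```

Run over a real catalogue of titles and publication years, a table like this would show whether average title length falls from decade to decade, which is the trend described above.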
Moretti ultimately hopes to deduce laws that govern the evolution of literature and to use them to make predictions about its past and future. Currently, however, so small a portion of all novels is publicly available in digital form that it is hard to formulate and test hypotheses. Still, Moretti believes that time is on his side. "In a matter of years, maybe ten or fifteen years, all literature ever written will be in digital, searchable form," he says. "The question is: what do we do with this incredible archive?"