February 25, 2008

We at Science News read a lot of journal articles—scores and scores each week. The choicest few become fodder for what ends up filling our pages. But poring over all of those papers makes us quite familiar with a topic—“the bleak sensory landscape” of scientific texts—discussed in a commentary released today in, of all places, EMBO reports. It’s a publication of the European Molecular Biology Organization.

The authors, an electrical engineer at Columbia University and a geneticist at the University of Chicago, used data-mining algorithms to tally the number of times various sensory terms occur in different types of texts: news articles issued by the Reuters wire service; literary works of Shakespeare, Poe, and Whitman; the open-access Wikipedia; and nearly a quarter-million scholarly papers published in 78 biomedical-research journals.

Sensory terms might include time (hours or picoseconds); colors (azure or dark red); smells (cheese-scented or sulfurous); even tactile sensations (glassy or sandpapery).
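For readers who want to picture that kind of tally, here is a minimal sketch in Python. It is not the authors' method, which the commentary does not spell out. The term list `SENSORY_TERMS` and the function `sensory_rate` are invented for illustration, and a real lexicon would presumably run to thousands of entries.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon (an assumption for illustration only);
# the commentary does not list the terms its authors searched for.
SENSORY_TERMS = {
    "time": {"hours", "picoseconds"},
    "color": {"azure", "red"},
    "smell": {"sulfurous"},
    "touch": {"glassy", "sandpapery"},
}


def sensory_rate(text):
    """Return sensory-term hits per 1,000 words, broken down by category."""
    # Lowercase the text and split it into words, keeping hyphenated words whole.
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return {category: 0.0 for category in SENSORY_TERMS}
    return {
        category: 1000 * sum(counts[term] for term in terms) / total
        for category, terms in SENSORY_TERMS.items()
    }


print(sensory_rate("The azure sky turned red after two hours"))
```

Normalizing by text length, as this sketch does, is one plausible way to compare a sonnet with a wire story; whether the commentary's authors normalized their counts this way is not something the article says.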

“Compared with Whitman and Reuters, sensory terms are almost absent from [research] journals,” the authors find. Indeed, they report, “Journals are among the poorest in time-related terms.” What may come as a surprise is that Shakespeare fared even worse in the new tally of sensory terms than did the research journals. So did Wikipedia, though I, for one, am less astonished about that particular observation.

“We conjecture that a piece of sensory-poor prose does, on average, a poorer job of engaging the reader’s imagination than a sensory-rich one,” write Raul Rodriguez-Esteban and Andrey Rzhetsky.

No duh!

Sensory terms help us identify with, and even begin to viscerally feel, material and ideas. A richer use of sensory language may help us better understand details and cement them into memory. I’m more likely to recall a story about a mottled brown tomato than about one that was merely described as discolored. A nearly microscopic yellow spider with a viscous slimy orange goo oozing from behind its eyes: That I can picture, and remember. But some tiny member of a newly discovered arachnid genus characterized by a periodic subocular secretion: That’s easy to dismiss and forget.

How ironic, then, that the 4-page EMBO reports “viewpoint” by Rodriguez-Esteban and Rzhetsky is about as devoid of sensory terms as the texts it takes issue with. It’s also needlessly vague.

For instance, I would have appreciated learning how journal papers, collectively, compared quantitatively with Whitman or Wikipedia, not just that the authors broadly rated one source as “better” or “worse.” I’d also like to know how the authors scouted for sensory terms. With thousands available, did they have to key them all in, or did they just count the use of several common ones? I wish they’d helped me visualize the process they used and given me an idea of how long and arduous the computations were or, conversely, whether they were finished in the blink of an eye.

Were all journals equally sparse in their sensory descriptors? Nothing in the article gives us a clue. What’s more, the language in the paper tends to be quite dense, something that in itself can obscure meaning. Consider this passage:

“Multidimensional scaling in its ‘true’ form involves reconstructing a geographical map from a set of known distances between cities. In our case, we used it to arrange points corresponding to our six corpora on a plane, so that the resulting distances are as close as possible to the Euclidean distances between corpus-specific vectors of frequencies of sensory terms…”

Such abstract language is hardly something your grandmother, PTA president, or dry cleaner could be expected to understand, much less care about.

Then again, perhaps I shouldn’t complain. It’s writing like that in the EMBO reports commentary that confirms the need for professional interpreters, like Science News.

