Wha’dja say?

Scientists find phonetic shortcuts in speech

BALTIMORE — Talk may be cheap, but that doesn’t keep people from budgeting their speech.

When talking casually, people routinely streamline their utterances by dropping segments, syllables and even whole words, researchers reported April 20 at the spring meeting of the Acoustical Society of America.

Insights into these conversational shortcuts could improve the learning and teaching of second languages. They could also help scientists design better speech recognition programs, which are typically attuned to carefully enunciated words, not everyday talk.

“Most of the speech we communicate with is not careful speech at all,” said Natasha Warner, director of the Douglass Phonetics Lab at the University of Arizona in Tucson. “You try and give this stuff to a speech recognition program, and it totally goes to pieces.”

To parse the spontaneous speech of everyday life, Warner and her colleagues had 13 undergraduate students sit in a sound booth, each with a recording microphone on one ear and a telephone on the other. The researchers recorded the students having 10-minute conversations with a friend or family member. The same students were also recorded while reading a story in which particular words were embedded and while reading a list of words.

The researchers then analyzed several aspects of the students’ speech, such as how long consonants lasted, the presence or absence of “bursts” (when the lips come apart, as in “apple”) and the duration of vocal cord vibration.

The standout findings were not what was said during casual conversations, but what wasn’t. “Reduction is the norm, not the exception,” said Warner. “It’s massive — syllables are gone.”

For example, when a speaker uttered the phrase “We were supposed to see it yesterday, but I felt really bad,” the word “yesterday” shrank considerably. When said carefully, the word is three distinct syllables. But within a spoken sentence it is often reduced to a two-syllable, unidentifiable “yesh-ee” (see sound file). In another example, Warner found that in the sentence “I can’t register in person, so they’re just going to have to live with that,” the phrase “going to have to” was reduced not just to “gonna hafta” but all the way down to “got-da.”

Other words appear to get lost entirely, Warner noted.

The research is pertinent not just for understanding regular speech, but also for creating it. It could aid speech-generating devices and therapies for people who have had laryngectomies or who can’t speak due to conditions such as ALS, said Sandra Combs, a specialist in the Communications Sciences and Disorders program at the University of Cincinnati in Ohio. “This could be really useful on the synthesis side, for people who want to sound casual, more natural.”

This natural speech isn’t lazy; it’s just more efficient, as long as a listener can still understand, said Warner. And while the study focused on undergraduates, this efficient speech isn’t just for the young. Warner’s research was prompted in part when she noticed that she dropped sounds while asking her son if he wanted a peanut butter and jelly sandwich. (The “t” sound in “butter” tends to disappear.) “I thought if I’m doing that,” Warner said, “how often does it happen, and what does the listener do?”