Speech is processed in chunks

How can we understand a continuous flow of speech when the capacity of our working memory is so small? This is what linguists and neuroscientists are currently working on. The answer may lie in language chunking.

We understand speech most of the time without paying attention to the process. We are able to pick up the essentials of a conversation even though the speech stream moves rapidly forward and older material fades quickly to make way for the new.

Linguists and neuroscientists have now joined forces to solve this mystery. Linear Unit Grammar, a theoretical model for language chunking, serves as the starting point.

“Our assumption is that speech is understood by chunking it into pieces suitable for our working memory”, says the leader of the project, Professor of English Anna Mauranen.

This assumption will be tested through linguistic and neuroscientific methods. In the studies conducted by linguists, subjects will work on an iPad that presents a sample of recorded speech together with a transcription of the same extract. The subjects will be asked to mark the points that they identify as the boundaries of chunks.

In the project, Anna Mauranen works closely with Svetlana Vetchinnikova. And Mauranen could use all the help she can get, as she currently spends most of her day attending to the duties of her position as vice-rector of the University of Helsinki. The team has also received invaluable help from research assistant Nina Mikusova, a linguistics student who has also mastered programming.

In the next stage of the study, neuroscientists Satu Palva and Matias Palva will investigate whether the boundaries of chunks are discernible in a brain scan.

What is the distinction?

The Linear Unit Grammar model, which provides the basis for the project, was developed by Anna Mauranen together with John Sinclair as early as 2006. According to this model, language is linear and is made up of units, which can be words or sentences or something in between. Sometimes the units can be smaller than words (e.g., well or oh).

We continuously process the utterances we hear. The assumption is that chunking helps in this rapid process. Some chunks have interactional or organising functions: “I don’t know what to say”, “I could tomorrow”, “let’s stop and think…”. Even if you mumble long but familiar units, your listener will probably still understand what you are saying: “the University admissions board”, “teaching and research staff”, “the Finnish Food Safety Authority”.

It took some time for the linguistic community to take note of Mauranen and Sinclair’s theory. It also took a while for Mauranen to find partners among neuroscientists. She was referred from one scientist to another until she found the Palvas, who had been studying related issues from a medical point of view. The project was named Kielellinen palastelu: merkityksen ja prosessoinnin yksiköt (Language chunking: units of meaning and processing).

Helpful in language learning

If the project succeeds as planned, in a few years the results can be usefully applied in the diagnosis of linguistic disorders and in language learning.

“We can, for example, increase our understanding of how language learners analyse language. We know already that speakers’ experience of language often conflicts with grammatical descriptions of language”, says Anna Mauranen.

The project sets off under favourable circumstances, as it has just received a quarter of a million euros in funding from the Finnish Cultural Foundation. At its annual celebration on 27 February, the Cultural Foundation awarded a total of 25 million euros in grants to the arts and sciences. Research in the humanities was well represented among the top projects this year.

Video: A test subject marks the points in the text that she sees as the boundaries of meaningful units.