Speech and Language Processing

Transcribing the speech of L2 (second-language) learners into text poses several challenges for speech technology and machine learning methods.

These include recognizing non-native, heavily accented, and disfluent speech. The task is also severely under-resourced, as there are no matching speech databases for Finnish or Swedish that could be used to train the models. At Aalto University, we will draw on our state-of-the-art speech recognition expertise and apply several speaker adaptation methods to bridge the gap between native speech models and individual L2 learners. The more accurately we can recognize what was said, the more reliable the automatic assessment of oral skills will be.

Beyond speech-to-text recognition, the automatic assessment of oral skills from the speech recognition and segmentation results is another interesting and challenging problem. It includes the evaluation of pronunciation, prosody, fluency, vocabulary, and grammar. All of these tasks require advanced machine learning methods to learn the underlying latent variables that connect the speech samples to the official oral skill levels defined by human experts.

Researchers in the project team are Mikko Kurimo, Aku Rouhe, Ekaterina Voskoboinik, Ragheb Al-Ghezi, Yaroslav Getman, Clara Akiki, and Mateiu Tudor.