Mamady Nabé


Bat Michel Dubois - bureau E115
CS40700 - 38058 Grenoble Cedex

Thesis topic

Bayesian modeling of speech perception with a temporal treatment inspired by the properties of neural oscillations.


Julien Diard
Jean-Luc Schwartz

Thesis Summary

The relationship between perception and production mechanisms is a central issue in many domains of cognitive science. This is notably the case in speech communication, where predictions from speech production simulation interact in various ways with perceptual processes. In this context, we have developed COSMO (Communicating about Objects using SensoriMotor Operations), a family of Bayesian algorithmic models of communicating agents. We have previously used such models to study the evolution of phonological systems (Moulin-Frier et al., 2015), speech perception and learning (Laurent et al., 2017; Barnaud et al., 2017; 2018), and speech production and adaptation (Patri et al., 2015; 2018).

However, these models have so far relied on a greatly simplified treatment of time. For instance, syllable perception was restricted to consonant-vowel syllables, under the assumption that the key points of speech trajectories, at the center of the consonant and of the vowel respectively, had already been identified. This, of course, contrasts with natural speech processing, where sensory inputs and motor controls unfold continuously over time. Indeed, the neuronal substrates that process auditory input in the brain are well described in terms of their oscillatory nature, since they intrinsically have to deal with the temporal properties of speech, and of their predictive nature, since they aim at anticipating events.

In this PhD project, we aim to extend the COSMO framework to define the first Bayesian perceptuo-motor model of continuous speech communication. In previous work on Bayesian word recognition modeling (Phénix et al., 2018; Ginestet et al., 2019), we have developed mathematical tools to describe the temporal dynamics of perceptual evidence accumulation across layers of hierarchical representations. Probability distributions at each layer (letters and words) evolve continuously over time, as a function of bottom-up sensory evidence and top-down lexical constraints, to predict upcoming events. Crucially, these tools also model, on the one hand, attentional control of these information flows and, on the other hand, asynchronous and asymmetric information transfer. Applying these mathematical constructs to speech communication modeling would yield a novel class of Bayesian hierarchical and predictive models, able to account for observations of neuronal oscillatory systems in the brain. The collaboration with Geneva will provide a unique framework for combining the Bayesian approach with neuroscience constraints and data, and a valuable multidisciplinary environment for the progress of the PhD.
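As a purely illustrative sketch of evidence accumulation over time, and not the model developed in the work cited above, the following Python code updates a probability distribution over a few candidate words as sensory evidence arrives step by step. The function name `accumulate_evidence`, the `leak` parameter, and the toy numbers are all assumptions introduced for this example: at each step the current belief is pulled slightly back toward a top-down prior, then multiplied by the bottom-up likelihood and renormalized.

```python
def accumulate_evidence(likelihoods, prior, leak=0.1):
    """Recursively update a distribution over hypotheses (e.g. words).

    likelihoods: list of T lists, each giving P(observation_t | hypothesis_k).
    prior: list of K non-negative weights, the top-down prior (normalized here).
    leak: hypothetical mixing weight in [0, 1]; before each update the belief
          decays toward the prior (an assumed stand-in for information loss).
    Returns a list of T posterior distributions, one per time step.
    """
    total = sum(prior)
    prior = [p / total for p in prior]
    belief = list(prior)
    history = []
    for lik in likelihoods:
        # Top-down step: mix the previous belief with the prior.
        predicted = [(1.0 - leak) * b + leak * p for b, p in zip(belief, prior)]
        # Bottom-up step: weight by the sensory likelihood (Bayes' rule).
        unnormalized = [q * l for q, l in zip(predicted, lik)]
        norm = sum(unnormalized)
        belief = [u / norm for u in unnormalized]
        history.append(belief)
    return history


# Toy example: three candidate words; the input consistently favors the third.
prior = [0.5, 0.3, 0.2]
likelihoods = [[0.2, 0.2, 0.6]] * 20
posteriors = accumulate_evidence(likelihoods, prior)
```

In this toy setting, the probability of the third word grows over time even though it starts with the lowest prior, while the leak term keeps the distribution from collapsing entirely onto a single hypothesis.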

Topics of Interest

  • Bayesian Modeling
  • Speech Perception
  • Speech Recognition
  • Cognitive Science
  • Artificial Intelligence

Funding statement

This work is supported by the French National Research Agency in the framework of the Investissements d’avenir program (ANR-15-IDEX-02; PhD grant to MN from Université Grenoble Alpes ISP project Bio-Bayes Predictions). The authors also acknowledge additional support from the Auvergne-Rhône-Alpes (AURA) Région (PAI-19-008112-01 grant).

View online: Follow me on RG