
Thesis Marion MAINSANT


From 1 March 2020 to 31 December 2023

Continual Learning for Multimodal Fusion

The human brain continuously receives new information from external stimuli. Information received from each sense (vision, hearing, touch, etc.) is collected, analyzed and combined with that of the other senses in order to be interpreted. New information does not overwrite what was previously learnt but extends the brain’s knowledge.
Deep learning algorithms aim to simulate this type of learning. For now, however, a computer can have many sensors that receive external information, but these sensors do not necessarily communicate with each other to share and “understand” the global information. Furthermore, when a deep learning algorithm learns new knowledge, that knowledge overlaps with the old, and most of the time the old knowledge is forgotten. This type of forgetting is called catastrophic forgetting. Behind these observations lies one of the major challenges for tomorrow’s deep learning systems: how can an intelligent system adapt to a changing environment? Could a robot adapt to everyone?
To answer these questions, researchers introduced the notions of incremental learning, personalization and multimodality, three growing research fields within a broader area of deep learning called lifelong learning.
An incremental learning algorithm is currently being developed in our research laboratory. The results obtained with it are already encouraging on datasets such as MNIST, CIFAR10 and CIFAR100 (Solinas et al.). This type of algorithm makes it possible to overcome catastrophic forgetting of previously learnt classes. Some researchers have proposed using incremental learning to learn new instances of known classes (Lomonaco and Maltoni). Used this way, incremental learning could make it possible to personalize an algorithm to new, previously unseen instances. In parallel, an interesting paper explored the learning of two modalities, audio and image, in a spiking neural network for the MNIST dataset (Rathi and Roy 2019) and showed that multimodality has the advantage of improving accuracy and being more robust to noisy data.
We would like to position our thesis project at the heart of this research and propose a framework that addresses the pressing deep learning issue of incremental multimodal learning and that can also be adapted to the question of personalization.

- Martial MERMILLOD - (martial[dot]mermillod[at]univ-grenoble-alpes[dot]fr)
- Marina REYBOZ - (marina[dot]reyboz[at]cea[dot]fr)
- Christelle GODIN - (christelle[dot]godin[at]cea[dot]fr)

Keywords: Continual Learning, Incremental Learning, Emotion Detection, Multimodal Fusion, Deep Learning, Personalization




Carnot Exploratoire CEA - Dotation des EPIC et EPA

Submitted on 17 November 2023

Updated on 17 November 2023