PhD thesis / Équipe Vision et Émotion
From 1 September 2023 to 30 August 2026
Contribution of multi-modality to embedded incremental learning
The massive proliferation of deep learning (DL) methods across all fields of application has raised awareness of two major limitations of these otherwise very powerful statistical data-processing methods: the first is the exorbitant cost, both energetic and financial, of training neural networks; the second is their dependence on access to massive quantities of data and, above all, on their annotations.
One way around these limitations is, on the one hand, to train the network as close as possible to the sensors, limiting the need for massive data movement and the associated infrastructure costs, and, on the other, to learn continuously in order to handle data that may not all be available initially. The edge-computing trend is now attempting to bring computation closer to the sensors by integrating AI into low-power electronic devices. This approach is already well studied, particularly in terms of efficient neuromorphic architectures, but it still concerns only the inference phase.
In this thesis project, we focus on the learning phase by studying the online learning capabilities of artificial neural networks. Although our ultimate aim is to develop techniques that benefit from the advantages of neuromorphic architectures, the first phase of this study will rely on conventional deep neural networks in order to establish a baseline for comparison in terms of model accuracy, computing cost, data requirements, and energy consumption. By collecting data from different (multimodal) sensors, these intelligent systems can detect changes in the data and in the environment that generates them. Such changes may require the underlying neural network to learn incrementally, either to adjust its model to new conditions or to learn new categories, depending on the problem being addressed. In all cases, this relearning must be carried out within a reduced time and energy budget.
Online learning must contend with several obstacles on which the literature has yet to reach consensus: novelty detection, learning with few or no labels, and catastrophic forgetting when new data are learned continuously.
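As a toy illustration of catastrophic forgetting (not part of the project description), consider a single linear classifier trained sequentially on two conflicting synthetic tasks: everything below (the tasks, the model, all names) is a simplifying assumption, but the effect it shows is the one the thesis targets.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(sign):
    """Synthetic binary task: the label depends on the sign of feature 0."""
    X = rng.normal(size=(200, 2))
    y = (sign * X[:, 0] > 0).astype(float)
    return X, y

def train(w, X, y, epochs=200, lr=0.5):
    """Plain logistic-regression gradient descent."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float(((X @ w > 0).astype(float) == y).mean())

Xa, ya = make_task(+1)   # task A: positive feature 0 -> class 1
Xb, yb = make_task(-1)   # task B: the opposite rule

w = train(np.zeros(2), Xa, ya)
acc_A_before = accuracy(w, Xa, ya)   # high: task A has been learned
w = train(w, Xb, yb)                 # sequential training on task B only
acc_A_after = accuracy(w, Xa, ya)    # collapses: task A is forgotten
```

Training on task B overwrites the weights that solved task A, so accuracy on A collapses; incremental-learning methods aim precisely at preventing this collapse without replaying all of task A's data.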
Yet the biological brain naturally copes with the constant changes in our environment, from a very early age through to adulthood. It exhibits a wide range of plasticity capacities, which manifest at several levels of its organisation. In particular, it exploits the spatio-temporal correlations arising from the different sensory modalities it uses to perceive its environment. These modalities merge and complement each other while being processed and routed along different neural pathways.
In this project, we therefore want to study how projection between modalities can improve the quality of lifelong learning, by mitigating catastrophic forgetting while reducing the need for data annotation.
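The idea of projecting one modality onto another can be sketched in a few lines: two feature streams observe the same underlying events, and a mapping learned on a small paired subset then produces pseudo-targets for the remaining unlabelled samples. The linear modalities and the least-squares projection below are illustrative assumptions, not the project's actual method.

```python
import numpy as np

rng = np.random.default_rng(1)

# A shared latent cause observed through two hypothetical "modalities"
# (think audio and vision features); both are noisy linear views of it.
Z = rng.normal(size=(500, 4))          # latent events
Wa = rng.normal(size=(4, 8))           # latent -> modality A features
Wb = rng.normal(size=(4, 6))           # latent -> modality B features
A = Z @ Wa + 0.01 * rng.normal(size=(500, 8))
B = Z @ Wb + 0.01 * rng.normal(size=(500, 6))

# Learn a projection A -> B by least squares on a small "paired" subset,
# then use it to predict modality B for the unpaired samples
# (pseudo-targets that could supervise learning without annotations).
P, *_ = np.linalg.lstsq(A[:100], B[:100], rcond=None)
B_pred = A[100:] @ P
rel_err = float(np.mean((B_pred - B[100:]) ** 2) / np.mean(B[100:] ** 2))
```

Because both views share the same latent cause, a projection fitted on a small paired subset generalises to the rest; in the thesis setting, such cross-modal predictions could serve as a supervisory signal when annotations are scarce.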
BENOIT MIRAMOND, benoit.miramond@univ-cotedazur.fr (Director)
Marina REYBOZ, marina.reyboz@cea.fr (Codirection)
Martial MERMILLOD, martial.mermillod@univ-grenoble-alpes.fr (Codirection)
Incremental learning, Multimodal learning, Neuromorphic architectures, Embedded artificial intelligence
CEA - Bourse CTBU