Context-Aware Audio-Visual Speech Enhancement Based on Computational Intelligence and User Preference Learning
Adaptive Speech Enhancement (SE) strategies in speech and hearing technologies are essential due to the diverse needs of users and the dynamic environments in which they communicate.
Traditional approaches often prove inadequate because they fail to account for individual listener preferences and specific contextual challenges. By incorporating user preferences and environmental context, adaptive strategies can customise SE to the specific situation and individual, thereby optimising battery life and overall sound quality while improving clarity and comprehension.
This personalisation ensures a more effective communication experience, overcoming the limitations of standard methods that may not suit all scenarios or meet the needs of individual users.
Outcomes
The research developed a novel application of neuro-fuzzy modelling for personalised audio-visual (AV) speech processing. The AV SE model improved speech clarity and intelligibility by contextually leveraging audio and visual information in a user-aware manner.
The system dynamically adapted its SE strategies to user-specific contexts, the listener's environment, and individual preferences, making it both adaptable and capable of learning from user feedback to enhance the listening experience.
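To make the adaptation mechanism concrete, the sketch below illustrates the kind of inference-plus-feedback loop a neuro-fuzzy SE controller builds on: a minimal Takagi-Sugeno-style system maps two context inputs (estimated SNR and a visual-cue reliability score) to a single "enhancement strength", and nudges its rule consequents towards a user's preferred setting. This is an illustrative assumption, not the published COG-MHEAR model; the membership parameters, rule consequents, and names such as `PreferenceTunedFuzzySE` are all hypothetical.

```python
import numpy as np

def gauss(x, c, s):
    """Gaussian membership function centred at c with width s."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

class PreferenceTunedFuzzySE:
    """Minimal Takagi-Sugeno sketch: maps estimated SNR (dB) and a
    visual-cue reliability score in [0, 1] to an SE 'enhancement
    strength' in [0, 1], learning rule consequents from feedback."""

    def __init__(self, lr=0.1):
        # Antecedent membership functions: (centre, width) for low/high.
        self.snr_mfs = {"low": (-5.0, 7.5), "high": (15.0, 7.5)}
        self.vis_mfs = {"low": (0.2, 0.25), "high": (0.8, 0.25)}
        # One zero-order consequent per rule, in the order
        # (snr, vis) = (low,low), (low,high), (high,low), (high,high).
        # Initial guesses only; refined by user feedback below.
        self.consequents = np.array([0.9, 0.7, 0.4, 0.1])
        self.lr = lr

    def _firing_strengths(self, snr_db, vis):
        w = []
        for s_label in ("low", "high"):
            for v_label in ("low", "high"):
                mu_s = gauss(snr_db, *self.snr_mfs[s_label])
                mu_v = gauss(vis, *self.vis_mfs[v_label])
                w.append(mu_s * mu_v)      # product t-norm
        w = np.array(w)
        return w / (w.sum() + 1e-12)       # normalised firing strengths

    def infer(self, snr_db, vis):
        """Defuzzified output: firing-strength-weighted mean of consequents."""
        w = self._firing_strengths(snr_db, vis)
        return float(np.clip(w @ self.consequents, 0.0, 1.0))

    def update_from_feedback(self, snr_db, vis, preferred):
        """ANFIS-style consequent update: gradient step on squared error
        between the output and the user's preferred strength, apportioned
        to each rule by its firing strength."""
        w = self._firing_strengths(snr_db, vis)
        err = preferred - (w @ self.consequents)
        self.consequents = np.clip(self.consequents + self.lr * err * w, 0.0, 1.0)
```

A usage example under the same assumptions: in a noisy scene with good lip cues, the controller proposes a strength, the user adjusts it, and the rules drift towards that preference for similar future contexts.

```python
fis = PreferenceTunedFuzzySE()
strength = fis.infer(snr_db=0.0, vis=0.9)           # proposed setting
fis.update_from_feedback(snr_db=0.0, vis=0.9, preferred=0.6)
```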
Partners
This research was a joint collaboration with Edinburgh Napier University as part of their EPSRC-funded "Towards cognitively-inspired 5G-IoT enabled, multi-modal Hearing Aids" (COG-MHEAR) programme.
Publications
- S. Chen, J. Kirton-Wingate, F. Doctor, U. Arshad, K. Dashtipour, M. Gogate, Z. Halim, A. Al-Dubai, T. Arslan, A. Hussain, "Context-Aware Audio-Visual Speech Enhancement Based on Neuro-Fuzzy Modeling and User Preference Learning," IEEE Transactions on Fuzzy Systems, vol. 32, no. 10, pp. 5400-5412, Oct. 2024.