Journal of Multi Disciplinary Engineering Technologies

Volume 19 • Issue 01 • Published: Dec 2025 • ISSN (Print): 0974-1771 • ISSN (Online): 2581-9372

Deep Learning Approaches for Multimodal Emotion Recognition: Trends, Issues, and Prospects

Kirti Sharma¹*, Rainu Nandal²

¹ University Institute of Engineering Technology, Maharshi Dayanand University, Rohtak, Haryana, India.

² University Institute of Engineering Technology, Maharshi Dayanand University, Rohtak, Haryana, India.

*Corresponding author(s): krtbhardwaj1@gmail.com

Contributing authors: rainunandal.uiet@mdurohtak.ac.in

Abstract

High-fidelity human-computer interaction now hinges on a machine’s ability to decode affective states, a task where single-source data (speech or text alone) consistently falls short. This paper moves away from unimodal approaches and centers on Multimodal Emotion Recognition (MER) as the primary vehicle for context-aware intelligence. We analyze current deep learning frameworks, examining specifically how different fusion topologies and network architectures handle cross-modal interference. Beyond a performance review, this survey exposes the weaknesses of state-of-the-art systems: persistent cultural bias, the scarcity of high-quality labels, and the inherent opacity of deeply stacked models. Our findings suggest that the next frontier lies not in larger models but in “low-rank” self-supervised learning and Explainable AI (XAI). By prioritizing these lean, transparent methodologies, the field can move toward emotion-aware technology that is both ethically robust and deployable on edge devices.

Keywords

Multimodal Emotion Recognition
Deep Learning
Transformers
Fusion
Affective Computing
