Journal of Multi Disciplinary Engineering Technologies
Volume 19 • Issue 01 • Published: Dec 2025 • ISSN (Print): 0974-1771 • ISSN (Online): 2581-9372

Deep Learning Approaches for Multimodal Emotion Recognition: Trends, Issues, and Prospects

Kirti Sharma1*, Rainu Nandal2

1 University Institute of Engineering Technology, Maharshi Dayanand University, Rohtak, Haryana, India.
2 University Institute of Engineering Technology, Maharshi Dayanand University, Rohtak, Haryana, India.
*Corresponding author(s): krtbhardwaj1@gmail.com
Contributing authors: rainunandal.uiet@mdurohtak.ac.in

Abstract

High-fidelity human-computer interaction depends on a machine’s ability to decode affective states, a task for which single-source data (speech or text alone) consistently falls short. This paper therefore moves beyond unimodal approaches and focuses on Multimodal Emotion Recognition (MER) as a basis for context-aware intelligence. We review current deep learning frameworks, examining in particular how different fusion topologies and network architectures handle cross-modal interference. Beyond comparing performance, the survey identifies weaknesses in state-of-the-art systems: persistent cultural bias, the scarcity of high-quality labels, and the inherent opacity of deep models. Our findings suggest that progress will come not from larger models but from “low-rank” self-supervised learning and Explainable AI (XAI). By prioritizing these lean, transparent methods, the field can move toward emotion-aware technology that is both ethically robust and deployable on edge devices.

Keywords

Multimodal Emotion Recognition
Deep Learning
Transformers
Fusion
Affective Computing
