Multimodal Emotion Recognition: Integrating Text, Speech, and Facial Expressions for Enhanced Human‐Computer Interaction
Abstract
This thesis presents a multimodal emotion recognition system that integrates text, audio, and facial expression modalities through attention-based feature-level fusion. The system achieves 71% accuracy for seven-class emotion recognition and 85% for sentiment analysis on the MELD benchmark dataset. The thesis offers a realistic assessment of both the potential and the current boundaries of multimodal emotion recognition (MER) systems, establishing a foundation for future research while acknowledging important limitations regarding real-world generalization and the need for culturally specific models.
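The attention-based feature-level fusion mentioned above can be illustrated with a minimal sketch: each modality contributes an embedding, a scoring vector assigns each modality a relevance score, and a softmax over those scores yields weights for a weighted sum of the embeddings. The 16-dimensional embeddings, the single scoring vector, and the random initialization below are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

def attention_fuse(modality_feats, score_w):
    """Attention-based feature-level fusion of modality embeddings.

    modality_feats: (M, d) array, one row per modality
                    (e.g. text, audio, facial expression).
    score_w: (d,) scoring vector (learned in a real model;
             random here for illustration).
    Returns the fused (d,) feature vector and the (M,) attention weights.
    """
    scores = modality_feats @ score_w                # per-modality relevance
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over modalities
    fused = weights @ modality_feats                 # weighted sum -> (d,)
    return fused, weights

# Hypothetical 16-dim embeddings for the three modality streams
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 16))
fused, weights = attention_fuse(feats, rng.normal(size=16))
```

In a full model, the per-modality embeddings would come from pretrained encoders and `score_w` would be trained end-to-end; the fused vector then feeds a classification head for the seven emotion classes.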
Keywords
Multimodal Emotion Recognition, Affective Computing, Human-Computer Interaction, Attention Mechanisms, Transfer Learning, Deep Learning