Multimodal Emotion Recognition: Integrating Text, Speech, and Facial Expressions for Enhanced Human‐Computer Interaction
Abstract
This thesis presents a multimodal emotion recognition system that integrates text, audio, and facial expression modalities through attention-based feature-level fusion. The system achieves 71% accuracy for seven-class emotion recognition and 85% for sentiment analysis on the MELD benchmark dataset. The thesis offers a realistic assessment of both the potential and the current boundaries of multimodal emotion recognition (MER) systems, establishing a foundation for future research while acknowledging important limitations regarding real-world generalization and the need for culturally specific models.
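The attention-based feature-level fusion mentioned above can be illustrated with a minimal sketch: each modality contributes an embedding, a scoring vector assigns each modality a relevance score, and a softmax over those scores yields weights for a weighted sum of the embeddings. The 16-dimensional embeddings, the single scoring vector, and the random initialization below are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

def attention_fuse(modality_feats, score_w):
    """Attention-based feature-level fusion of modality embeddings.

    modality_feats: (M, d) array, one row per modality
                    (e.g. text, audio, facial expression).
    score_w: (d,) scoring vector (learned in a real model;
             random here for illustration).
    Returns the fused (d,) feature vector and the (M,) attention weights.
    """
    scores = modality_feats @ score_w                # per-modality relevance
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over modalities
    fused = weights @ modality_feats                 # weighted sum -> (d,)
    return fused, weights

# Hypothetical 16-dim embeddings for the three modality streams
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 16))
fused, weights = attention_fuse(feats, rng.normal(size=16))
```

In a full model, the per-modality embeddings would come from pretrained encoders and `score_w` would be trained end-to-end; the fused vector then feeds a classification head for the seven emotion classes.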
Keywords
Multimodal Emotion Recognition, Affective Computing, Human-Computer Interaction, Attention Mechanisms, Transfer Learning, Deep Learning