Speech emotion recognition (SER) is a challenging yet fundamental task in which a system identifies the emotion conveyed by an audio signal. Recently, machine learning techniques have shown promising results in this field. From a machine learning perspective, SER is a classification problem: an input audio sample must be assigned to one of a small set of predefined emotions. In this thesis, because Mel Frequency Cepstral Coefficient (MFCC) features capture the components that govern human auditory perception, MFCCs are used as the primary representation for emotion classification. After feature extraction, features are selected on the basis of information entropy using a genetic algorithm. Finally, the selected features are classified by an eight-layer convolutional neural network with three fully connected layers. The results show that the network achieves accuracies of 80.1% and 79.6% with 65 and 39 MFCC coefficients, respectively.
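The following is a minimal sketch of the MFCC-to-CNN portion of this pipeline, not the thesis implementation: it assumes librosa for MFCC extraction and PyTorch for the classifier, and the layer counts, kernel sizes, number of emotion classes, and choice of 39 coefficients are illustrative assumptions standing in for the eight-layer network described above (the genetic-algorithm selection step is omitted).

```python
# Sketch of an MFCC -> CNN emotion classifier (illustrative, not the thesis network).
import librosa
import numpy as np
import torch
import torch.nn as nn

def extract_mfcc(path, n_mfcc=39, max_frames=128):
    """Load an audio file and return a fixed-size (n_mfcc, max_frames) MFCC matrix."""
    signal, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along the time axis so every sample has the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :max_frames]
    return mfcc.astype(np.float32)

class EmotionCNN(nn.Module):
    """Small 2-D CNN over the MFCC matrix; a stand-in for the eight-layer network."""
    def __init__(self, n_classes=7):  # n_classes is an assumption, not from the thesis
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):  # x: (batch, 1, n_mfcc, max_frames)
        return self.classifier(self.features(x))

# Example usage with a hypothetical audio file:
# mfcc = extract_mfcc("sample.wav")
# logits = EmotionCNN()(torch.from_numpy(mfcc)[None, None, :, :])
```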