Electroencephalography (EEG) is a medical imaging technique that measures the electrical activity produced by the brain, recorded chronologically from the surface of the scalp. The recorded signals are rich in useful information, but inferring that information is a challenging task. This paper processes EEG signals to recognize human emotions, specifically happiness, anger, fear, sadness, and surprise, in response to audiovisual stimuli. The EEG signals are recorded by placing a NeuroSky MindWave headset on the subject’s scalp while audiovisual stimuli for the mentioned emotions are presented. The recorded raw EEG signals are preprocessed using a bandpass filter with a bandwidth of 1–100 Hz. The preprocessed signals are then analyzed further, and twelve selected features in different domains are extracted. Random forest (RF) and multilayer perceptron (MLP) algorithms are then used to classify the emotions from the extracted features. The proposed audiovisual-stimuli-based EEG emotion classification system achieves average classification accuracies of 80% and 88% using the MLP and RF classifiers, respectively, on hybrid features for experimental signals from different subjects. The proposed model performs well in terms of both cost and accuracy.
In recent decades, research on human emotions has received increasing attention in neuroscience and affective computing [

This work makes the following contributions. A new dataset of EEG signals with labeled human emotions is developed using a novel approach: the EEG signals and human emotions are recorded in response to audiovisual stimuli using the NeuroSky MindWave headset (NMH). In other studies, researchers have used either audio or visual stimuli, but not both simultaneously while recording the EEG and the human emotion. Features are extracted from the EEG signals in three different domains, namely the time, frequency, and wavelet domains, to obtain a hybrid feature set for robust classification. Some previous studies used this approach but included redundant features; in this study we avoid redundant features to reduce the computational load of the classifier without affecting classification accuracy. The subject’s emotion is labeled while audio and video with emotional content are played to the subject. RF is used as the main classifier in this study, which has not been used by other researchers for this task, and its performance is compared with other state-of-the-art classifiers.
The remainder of this paper is organized as follows. In Section 2, the materials and methods are briefly presented. In Section 3, we present the results and discussion. Finally, concluding remarks and future directions are provided in Section 4.
The EEG-based emotion classification approach is shown in the figure. The subject (participant) is exposed to the stimulus and the voltage differences produced by the subject’s brain are recorded; artifacts and noise are filtered out of the recorded signals; relevant features are extracted by analyzing the preprocessed signals; and classifiers are trained on the selected features.
All these steps are briefly discussed in the following subsections.
The EEG signals were recorded in a sound-attenuated and electrically shielded room while the participants sat comfortably on chairs. In total, 12 females and 28 males, aged 18 to 23 years, took part as subjects in our experiment. After the consent forms were filled in, the subjects were given a brief introduction to the study and the test phases. The layout of the proposed system and the EEG recording protocol are illustrated in the figure. After each clip, the subjects answered the following questions: (1) Which emotion did you experience from this video clip? (2) Rate the intensity of the emotion on a 6-point scale. (3) Did you experience any other emotion at the same or higher intensity than the stated one, and if so, which? (4) Did you watch this video clip for the first time?
To elicit the five emotions, namely happiness, sadness, anger, fear, and surprise, we used two video clips per emotion, selected from a pool of ten for each. A validation study was conducted with a panel of ten subjects who did not take part in the experiment; they were asked to deliberately choose the two clips for each emotion. The chosen clips are not of constant duration, which allows the EEG data to be recorded properly.
EEG signals are contaminated by several sources of noise. Because the EEG amplitude is very small, on the order of microvolts, this noise limits the signal’s usefulness; it is removed after the signals are converted to digital form. Since the EEG signal is acquired with metallic electrodes, a constant arbitrary voltage called the DC offset is added to the signal [
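As a concrete illustration, the preprocessing described here (mean subtraction to remove the DC offset, followed by 1–100 Hz band-pass filtering) could be sketched as follows. The Butterworth filter and its order are our assumptions, not choices specified in the paper:

```python
# Sketch of the preprocessing step: mean subtraction removes the DC offset,
# then a 1-100 Hz band-pass filter is applied. The 4th-order Butterworth
# design is an illustrative assumption.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 512  # NeuroSky MindWave sampling rate (Hz)

def preprocess(raw, fs=FS, low=1.0, high=100.0, order=4):
    """Remove DC offset and band-pass filter a raw single-channel EEG trace."""
    centered = raw - np.mean(raw)  # mean subtraction -> zero DC offset
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, centered)  # zero-phase filtering, no phase distortion
```

Zero-phase filtering (`filtfilt`) is used here so that the filter does not shift the temporal features later extracted from the signal.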
Features are inherent properties of the data that can be extracted from the EEG signals. Selecting them carefully reduces processing cost and increases the reliability of classification. Observable features of the signals recorded while the subjects watch the video clips are extracted to obtain information about the subject’s mental state. These features include statistical features of the signals in the time, frequency, and wavelet domains. Different states of mind are associated with different frequency rhythms such as delta
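The band-power features associated with these frequency rhythms can be sketched as follows. This is a minimal illustration; the exact band edges and the use of Welch's method are our assumptions:

```python
# Hypothetical band-power feature extraction for the standard EEG rhythms.
# Band edges and Welch's PSD are illustrative assumptions, not the paper's
# specified implementation.
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}

def band_powers(signal, fs=512):
    """Return the summed PSD in each EEG rhythm band."""
    freqs, psd = welch(signal, fs=fs, nperseg=min(len(signal), 2 * fs))
    return {name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}
```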
Several other statistical features used in this paper depend on the first and second derivatives of the signal. These features include Hjorth mobility and complexity.
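For instance, Hjorth mobility and complexity can be computed as in this minimal sketch, with finite differences standing in for the derivatives:

```python
# Hjorth mobility and complexity from the first and second derivatives,
# approximated by finite differences of the sampled signal.
import numpy as np

def hjorth(signal):
    """Return (mobility, complexity) of a 1-D signal."""
    d1 = np.diff(signal)              # first derivative (finite difference)
    d2 = np.diff(d1)                  # second derivative
    var0, var1, var2 = np.var(signal), np.var(d1), np.var(d2)
    mobility = np.sqrt(var1 / var0)   # proportional to mean frequency
    complexity = np.sqrt(var2 / var1) / mobility  # 1 for a pure sinusoid
    return mobility, complexity
```

A pure sine wave has complexity close to 1; more irregular signals score higher, which is what makes the measure useful as a feature.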
For classification, cross-validation is applied to the data to make the system robust against overfitting. All the data are divided into training and test sets using 10-fold cross-validation. MLP and RF are then applied to the features extracted as discussed in the previous section, and the predicted labels are compared with the actual ones.
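Assuming a scikit-learn implementation (the hyperparameters below are illustrative, not taken from this study), the cross-validated evaluation could look like:

```python
# 10-fold cross-validated evaluation of RF and MLP on an extracted feature
# matrix. Hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def evaluate(features, labels):
    """Return the mean 10-fold CV accuracy of RF and MLP classifiers."""
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    mlp = MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000, random_state=0)
    return {name: float(cross_val_score(clf, features, labels, cv=10).mean())
            for name, clf in [("RF", rf), ("MLP", mlp)]}
```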
An offline analysis of EEG-based emotion classification is explored in this paper. EEG signals were captured from 40 subjects. All signals were recorded with a simple, low-cost, single-channel NMH at a sampling frequency of 512 Hz. Zero-DC-offset signals were obtained by the mean subtraction method. Twelve features of the preprocessed signals were computed. The features are further divided into four sets: the first set contains hybrid features, the second time domain features, the third frequency domain features, and the fourth wavelet domain features, as shown in the table.
Set 1 hybrid features | Set 2 time | Set 3 frequency | Set 4 wavelet |
---|---|---|---|
Latency to amplitude ratio | Latency to amplitude ratio | Median | Entropy |
The MLP and RF classifiers are then trained using 10-fold cross-validation on each set, and the classification accuracy for each feature set is computed, as shown in the table.
Feature set | MLP accuracy (%) | RF accuracy (%) |
---|---|---|
Set 1 | 80 | 88 |
Set 2 | 80 | 85.5 |
Set 3 | 77.5 | 83 |
Set 4 | 72 | 80 |
Classifier | Avg. accuracy % | F-measure | Mean absolute error | Kappa statistic |
---|---|---|---|---|
MLP | 80 | 0.800 | 0.109 | 0.74 |
RF | 88 | 0.880 | 0.120 | 0.82 |
Emotion | Time (RF) | Time (MLP) | Frequency (RF) | Frequency (MLP) | Wavelet (RF) | Wavelet (MLP) | Hybrid (RF) | Hybrid (MLP) |
---|---|---|---|---|---|---|---|---|
Anger | 77 | 62 | 75 | 63 | 77 | 65 | 80 | 67.5 |
Fear | 82 | 72 | 79 | 80 | 79 | 77 | 84 | 77.2 |
Happy | 85 | 80 | 82 | 77.5 | 80 | 72 | 90 | 84.8 |
Sad | 89 | 87 | 90 | 87 | 88 | 86 | 91 | 90 |
Surprise | 89 | 81 | 89 | 80 | 86 | 78 | 90 | 80 |
The kappa coefficient measures classification agreement and is calculated from the confusion matrix. Kappa values lie in [−1, 1] but usually fall within [0, 1], which can be divided into five ranges representing different levels of agreement: [0.0, 0.20] slight, [0.21, 0.40] fair, [0.41, 0.60] moderate, [0.61, 0.80] substantial, and [0.81, 1] almost perfect.
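A minimal sketch of this calculation from a confusion matrix (assuming rows are actual classes and columns are predicted classes):

```python
# Cohen's kappa from a confusion matrix: observed agreement corrected for
# the agreement expected by chance.
import numpy as np

def cohen_kappa(cm):
    """Kappa = (p_o - p_e) / (1 - p_e) for a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_o = np.trace(cm) / n                                  # observed agreement
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)
```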
The confusion matrices of MLP and RF are shown in the following tables.
Anger | Fear | Happy | Sad | Surprise | Class | Individual accuracy (%) |
---|---|---|---|---|---|---|
27 | 6 | 1 | 4 | 2 | Anger | 67.5 |
4 | 31 | 1 | 0 | 4 | Fear | 77.5 |
3 | 0 | 34 | 1 | 2 | Happy | 85 |
1 | 3 | 0 | 36 | 0 | Sad | 90 |
4 | 2 | 0 | 2 | 32 | Surprise | 80 |
Anger | Fear | Happy | Sad | Surprise | Class | Individual accuracy (%) |
---|---|---|---|---|---|---|
33 | 2 | 1 | 3 | 1 | Anger | 82.5 |
3 | 34 | 1 | 0 | 2 | Fear | 85 |
1 | 2 | 36 | 0 | 1 | Happy | 90 |
1 | 2 | 0 | 37 | 0 | Sad | 92.5 |
2 | 1 | 0 | 1 | 36 | Surprise | 90 |
Ref. | Stimulus | No. of emotions | Equipment (frequency) | Feature extraction | Classifier | Accuracy (%) |
---|---|---|---|---|---|---|
[ | Music videos (1 min); IAPS (40 s) | 6 | Nervus EEG (256 Hz) | WT (db4) | MLP | 67.33–audiovisual stimuli; 63.35–visual stimuli |
[ | Music (30 s) | 4 | EEG Neuroscan (500 Hz) | FFT | SVM RBF | 86.15–joy; 74.11–anger; 79.59–sadness; 83.59–pleasure |
[ | Ekman’s picture set (5 s) | 6 | g.MOBIlab (256 Hz) | HOC | SVM polynomial | 83.33 |
[ | POFA (5 s) | 6 | g.MOBIlab (256 Hz) | HAF-HOC | SVM | 100–happiness; 72.33–surprise; 96.67–anger; 79.22–fear; 96.11–disgust; 66.67–sadness; 58.75–happiness |
[ | POFA (5 s) | 6 | g.MOBIlab (256 Hz) | DWT, DFT, and Gabor | PNN | 67.05–surprise; 73.64–anger; 56.79–fear; 69.47–disgust; 62.97–sadness |
[ | Video (4 min) | 2 | 62-channel cap (1000 Hz) | DE, DASM, RASM, and ES | SVM | 84.25 |
[ | DEAP (60 s) | 4 | BioSemi ActiveTwo (512 Hz) | WT (db5), CC, SE, and AR | ML-SVM | 95.83–exciting; 90.97–happy; 96.52–sadness; 93.05–hatred |
[ | Audio clips | 4 | NeuroSky (512 Hz) | Statistical, PSD, FFT, and WT | MLP | 78.11 |
[ | Video | 3 | EEG Neuroscan (1000 Hz) | PSD | SVM Gaussian | 73 |
Proposed | Video clips | 5 | NeuroSky (512 Hz) | Statistical, FFT, and WT | MLP; RF | 80; 87.5 |
In this paper, we analyzed the recognition of human emotions from EEG signals using audiovisual stimuli. Twelve selected features in the time, frequency, and wavelet domains are extracted and used as input to the classifiers. Five emotions, namely anger, fear, happiness, sadness, and surprise, are classified using the RF and MLP classifiers. The results obtained across the data of several subjects show that RF classifies the emotions with the highest accuracy.
Although the proposed model produced good results, improvements can still be made. One avenue relates directly to the hardware: a multi-channel EEG device could be used to obtain more accurate results, although acquiring new equipment increases the cost. Emotional induction is one of the most influential factors in the emotional behavior of the subject and in the corresponding biological responses, so improving the multimedia content is another consideration for future work.