Determination of Angry Condition based on EEG, Speech and Heartbeat
Abstract
This paper determines the angry emotional condition by analyzing and recognizing the speech signal and the EEG signal, as well as by detecting the heartbeat. For the speech analysis experiment, digital signal processing methods such as the autocorrelation and linear prediction techniques were introduced to analyze the features. An Artificial Neural Network (ANN) was then used to classify feature parameters such as the mean fundamental frequency, maximum fundamental frequency, standard deviation of the fundamental frequency, mean amplitude, pause length ratio, and first formant frequency in order to recognize the emotion. For the EEG analysis, the raw EEG signal underwent preprocessing to reduce artifacts to a minimum. Features such as the mean, the standard deviation, the peak amplitude, the peak amplitude in the alpha band (PAA), and the peak frequency in the alpha band (PAF) of the EEG signals were extracted. The selected features were classified using an ANN to obtain the maximum classification accuracy rate. Meanwhile, a heartbeat monitoring circuit was developed to measure the heartbeat. The results showed that the angry emotion exhibits a relatively low mean value and maximum peak amplitude, and a relatively high peak frequency in the alpha band (PAF), in the EEG signals. The mean fundamental frequency, the standard deviation of the fundamental frequency, and the mean intensity of the speech signal are good indicators for determining the angry emotion. This method can be applied further to recognize the angry emotion of a patient during a counseling session.
INTRODUCTION
Emotion is a complex psycho-physiological experience of an individual's state of mind as it interacts with biochemical (internal) and environmental (external) influences. Several models have been proposed to classify and represent emotions. Some models use the idea that all emotions can be composed of a few basic emotions, just as colors can be composed of primary colors [1, 2]. For both theoretical and practical reasons, some researchers define emotions along one or more dimensions. One popular version is the dimensional, or circumplex, model, which uses the dimensions of arousal and valence, as shown in Fig. 1 [3]. A third dimension is seldom used in research, and when it is used it is not always the same; dominance and motor activation (approach or avoidance) have both served as the third dimension [3].
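To make the dimensional model concrete, the sketch below places a few emotions as points in the valence-arousal plane. The coordinates are illustrative assumptions of ours, not values from the paper or from [3].

```python
# Illustrative sketch of the circumplex (valence-arousal) model.
# Coordinates are assumed for illustration; they are not from the paper.
# valence: unpleasant (-1) to pleasant (+1); arousal: calm (-1) to excited (+1)
EMOTIONS = {
    "angry":   (-0.7,  0.8),   # unpleasant, highly aroused
    "sad":     (-0.7, -0.5),   # unpleasant, low arousal
    "neutral": ( 0.0,  0.0),
    "happy":   ( 0.8,  0.5),   # pleasant, moderately aroused
}

def quadrant(valence: float, arousal: float) -> str:
    """Name the quadrant of the valence-arousal plane a point falls in."""
    v = "pleasant" if valence >= 0 else "unpleasant"
    a = "high-arousal" if arousal >= 0 else "low-arousal"
    return f"{v}/{a}"

for name, (v, a) in EMOTIONS.items():
    print(f"{name:8s} -> {quadrant(v, a)}")
```

In this representation, anger falls in the unpleasant/high-arousal quadrant, which is the region the rest of the paper tries to detect.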
EEG Database
For this study, the EEG signal database from Enterface’06 (referred to as the Enterface database) was used [9]. The Enterface experiment was intended to provide a database of EEG and other physiological signals for emotion recognition that the public may use, copy, and publish.
Each participant went through three sessions of 30 emotional stimuli in the Enterface experiment. Visual stimuli, specifically images from the International Affective Picture System (IAPS), were used to elicit the participants' emotions. Each emotional stimulus consisted of five images of the same class to ensure the stability of the emotion over time. Each image was displayed for 2.5 seconds, giving a total of 12.5 seconds per emotional stimulus block. A dark screen was then shown for 10 seconds to let the participants' emotions settle; during this interval, each participant was asked to self-assess his or her emotional state. These self-assessments were collected because emotions are known to be highly subject-dependent: one can never be sure of a person's actual emotion after viewing the images, so the self-assessment serves as an evaluation for each participant. Fig. 4 shows the Enterface experiment protocol.
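To make the timing of the protocol concrete, the following sketch reconstructs the schedule of one stimulus block from the figures given above (five images at 2.5 s each, followed by a 10 s dark screen for self-assessment). The function and constant names are ours, not from the Enterface software.

```python
# Sketch of one Enterface'06 stimulus block, reconstructed from the timing
# described above (names are ours, not from the Enterface software).
IMAGES_PER_BLOCK = 5      # five same-class IAPS images per emotional stimulus
IMAGE_DURATION_S = 2.5    # each image shown for 2.5 s -> 12.5 s per block
DARK_SCREEN_S = 10.0      # dark screen while the participant self-assesses

def block_schedule(start_s: float = 0.0):
    """Return (event, onset_s, duration_s) tuples for one stimulus block."""
    events, t = [], start_s
    for i in range(IMAGES_PER_BLOCK):
        events.append((f"image_{i + 1}", t, IMAGE_DURATION_S))
        t += IMAGE_DURATION_S
    events.append(("dark_screen_self_assessment", t, DARK_SCREEN_S))
    return events

for event, onset, dur in block_schedule():
    print(f"{event:28s} onset={onset:5.1f}s dur={dur:4.1f}s")
# Total block length: 5 * 2.5 + 10 = 22.5 s, repeated 30 times per session.
```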
Channel Selection
The EEG database was recorded from participants using the Biosemi Active 2 acquisition system with 64 EEG channels and peripheral sensors. To date, researchers have not been able to identify a specific scalp region where the brain activity is sufficiently pronounced to detect an emotional state. Therefore, 15 channels were chosen arbitrarily, and one channel was selected as the reference. The channels AF3, AF4, F3, F4, FCz, C3, C4, CPz, P3, Pz, P4, POz, O1, Oz, and O2 were selected for feature extraction, with Cz as the reference channel, as sketched below.
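A minimal sketch of this channel selection follows, assuming the 64-channel recording is available as a NumPy array with a matching list of Biosemi channel labels, and that referencing amounts to subtracting the Cz signal. The array and name list here are placeholders for the actual recording.

```python
import numpy as np

# The 15 channels chosen for feature extraction, plus the Cz reference.
SELECTED = ["AF3", "AF4", "F3", "F4", "FCz", "C3", "C4", "CPz",
            "P3", "Pz", "P4", "POz", "O1", "Oz", "O2"]
REFERENCE = "Cz"

def select_channels(eeg, channel_names):
    """Re-reference to Cz and keep only the 15 selected channels.

    eeg: array of shape (n_channels, n_samples) from the 64-channel recording;
    channel_names: the corresponding channel labels.
    """
    idx = {name: i for i, name in enumerate(channel_names)}
    ref = eeg[idx[REFERENCE]]                        # Cz reference signal
    picked = np.stack([eeg[idx[ch]] - ref for ch in SELECTED])
    return picked                                    # shape (15, n_samples)

# Placeholder data standing in for one recording (64 channels, 5 s at 256 Hz).
names = SELECTED + [REFERENCE] + [f"X{i}" for i in range(48)]
data = np.random.randn(64, 5 * 256)
print(select_channels(data, names).shape)            # -> (15, 1280)
```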
Increasing the number of channels from 2 to 16 increases the maximum classification accuracy. However, test classification accuracy starts to decline once more than 20 channels are selected, and the decline becomes more rapid when 40 or more channels are used. The decline occurs because the classifier has more parameters to estimate, which makes it harder to generalize, especially when dealing with a limited number of training patterns [8].
Fig. 5 shows a map of the electrode placement according to the 10-20 electrode placement system. The channels circled in green are the selected channels, and the channel circled in red is the reference channel.
Conclusion
The selected EEG features were classified using an Artificial Neural Network (ANN) to obtain the maximum classification accuracy rate. The results showed that a person in an unpleasant emotional state has a relatively low mean value and maximum peak amplitude in the EEG signals compared with the other two emotions. However, the peak frequency in the alpha band (PAF) of the EEG signals is relatively high when a person is in an unpleasant emotion. In conclusion, for the EEG analysis, a low mean value and a low maximum peak amplitude of the EEG signal, combined with a relatively high peak frequency in the alpha band, indicate an unpleasant emotional pattern.
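As a sketch of how the EEG features named above (mean, standard deviation, peak amplitude, PAA, PAF) could be computed for one channel, consider the following. The 8-13 Hz alpha range and the FFT-based amplitude spectrum are conventional choices assumed here, not parameters stated in the paper.

```python
import numpy as np

ALPHA_BAND = (8.0, 13.0)   # conventional alpha range in Hz (assumed)

def eeg_features(x, fs):
    """Compute the five EEG features used above for one channel.

    x: 1-D signal (one channel, one stimulus block); fs: sampling rate in Hz.
    Returns (mean, std, peak_amplitude, PAA, PAF).
    """
    mean = np.mean(x)
    std = np.std(x)
    peak_amp = np.max(np.abs(x))                 # maximum peak amplitude

    # Amplitude spectrum via FFT, restricted to the alpha band.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    alpha = (freqs >= ALPHA_BAND[0]) & (freqs <= ALPHA_BAND[1])

    paa = spectrum[alpha].max()                      # peak amplitude in alpha band
    paf = freqs[alpha][np.argmax(spectrum[alpha])]   # peak frequency in alpha band
    return mean, std, peak_amp, paa, paf

# Placeholder signal: a 10 Hz alpha rhythm plus noise, 5 s at 256 Hz.
fs = 256
t = np.arange(5 * fs) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.3 * np.random.randn(t.size)
print(eeg_features(x, fs))                       # PAF should come out near 10 Hz
```

Feature vectors of this form, gathered over the 15 selected channels, are what the ANN classifier operates on.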
All the feature parameters of the speech signals from male and female speakers were obtained using the autocorrelation and linear prediction methods, yielding the mean fundamental frequency, the standard deviation of the fundamental frequency, the maximum and minimum fundamental frequency, the mean intensity, the first formant frequency, and the pause length ratio. Among these parameters, the mean fundamental frequency, the standard deviation of the fundamental frequency, and the mean intensity showed good results in differentiating the angry emotion from non-angry emotion. This is because, during a high-degree emotion (anger), speech is uttered faster, at a greater volume, and at a raised tone: changes in respiration and increased muscle tension raise the rate of vibration of the vocal folds, and the variation of the fundamental frequency also increases. Therefore, the mean pitch, the standard deviation of the pitch, and the mean intensity were much higher. In contrast, during low-degree emotions (neutral and sadness), there was little alteration in the fundamental frequency, and both its value and the intensity were much lower than in the high-degree case.
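To illustrate the autocorrelation pitch method mentioned above, the following is a minimal sketch that estimates the fundamental frequency of voiced frames and derives the mean and standard deviation of the pitch. The 50-400 Hz search range and the 30 ms frame length are assumptions of ours, not parameters from the paper.

```python
import numpy as np

def f0_autocorrelation(frame, fs, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame by
    picking the strongest autocorrelation peak in the pitch lag range.
    fmin/fmax bound the plausible pitch values (assumed)."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)          # shortest lag = highest pitch
    lag_max = int(fs / fmin)          # longest lag = lowest pitch
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return fs / lag

def pitch_stats(signal, fs, frame_len=0.03):
    """Mean and standard deviation of F0 across fixed-length frames,
    the two pitch features found above to separate angry speech."""
    n = int(frame_len * fs)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n, n)]
    f0s = np.array([f0_autocorrelation(f, fs) for f in frames])
    return f0s.mean(), f0s.std()

# Placeholder "voiced" signal: a 180 Hz tone, 1 s at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 180 * t)
print(pitch_stats(x, fs))             # mean near 180 Hz, std near 0
```

On real speech, angry utterances would be expected to produce both a higher mean and a higher standard deviation from this routine than neutral or sad utterances, consistent with the conclusion above.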