CNN Based Driver Drowsiness Detection System Using Emotion Analysis

Driver drowsiness and rash driving are major causes of road accidents, which result in loss of life and degrade road traffic safety. Reliable and precise driver drowsiness detection systems are required to prevent road accidents and to improve road traffic safety. Various driver drowsiness detection systems have been designed with different technologies, each relying on a particular parameter for detecting the drowsiness of the driver. This paper proposes a novel multi-level model for detecting driver drowsiness using Convolution Neural Networks (CNN) followed by emotion analysis. The emotion analysis in the proposed model assesses the driver's frame of mind, which identifies the motivating factors for different driving patterns. These driving patterns are analyzed based on the acceleration system, the speed of the vehicle, the Revolutions per Minute (RPM), and facial recognition of the driver. The facial pattern of the driver is processed with a 2D Convolution Neural Network (CNN) to detect the driver's behavior and emotion. The proposed model is implemented using OpenCV, and the experimental results show that it detects the driver's emotion and drowsiness more effectively than the existing technologies.


Introduction
The growth in population and in automobile usage has increased the negative outcomes of road accidents: deadly injuries, loss of life, financial losses, and lasting physical and mental illness. The National Crime Records Bureau (NCRB) released a report in 2020 on the statistical analysis of road accidents [1]. The report states that around 5 lakh (500,000) road accidents were reported in one year, of which 69% caused a high level of damage to life and property. The report also analyzes the factors influencing road accidents. Driver drowsiness and mentality are the vital factors in road accidents and rash driving [2]. Drowsiness may be due to driving without rest, fatigue, or consumption of alcohol, while mentality relates to extreme anger, frustration, and sometimes extreme happiness. According to the report, driver behavior is the main cause of road accidents, which has motivated many researchers to work on systems that monitor and detect driver drowsiness. Some notable research results have been implemented in real time to reduce road accidents, yet the number of road accidents and the loss of life continue to escalate. Fig. 1 depicts the statistical analysis of fatal crashes due to driver fatigue and reckless driving. The statistics show that the fatality counts due to driver fatigue and reckless driving are almost equal, whereas present monitoring systems concentrate only on driver drowsiness. Certain driver monitoring systems detect driver drowsiness, whereas others monitor the vehicle acceleration and the driver's eye movement [3][4][5]. Recently, vehicle manufacturers have integrated driver drowsiness detection systems, such as Advanced Driver Assistance Systems, into their vehicles to monitor both the driver behavior and the vehicle acceleration system.
An Advanced Driver Assistance System consists of an eye movement monitoring sensor, hardware, and automation software that improve the performance of the driver drowsiness detection system [6]. Despite the evolution of Advanced Driver Assistance Systems, the number of road accidents continues to escalate due to rash driving. To address this limitation, the proposed model is an integrated monitoring system that detects both driver drowsiness and the mentality of the driver, which is the vital factor behind rash driving. The integrated monitoring system uses Convolution Neural Networks (CNN) to detect the fatigue of the driver, while the emotion of the driver is identified using a novel self-developed tool, the Driver Emotion Detection Classifier (DEDC). The DEDC is a tool with a trained database that extracts features from the recorded video to analyze the emotion of the driver. Depending on the detected emotion, a suitable prerecorded song is played to neutralize the mentality of the driver, so as to avoid reckless driving and prevent road accidents [7,8]. The major contributions of this research paper are: (i) a driver drowsiness detection system based on a Convolution Neural Network (CNN) for detecting the fatigue of the driver and monitoring the acceleration system of the vehicle; and (ii) a self-developed novel tool, the Driver Emotion Detection Classifier (DEDC), which monitors the mentality of the driver and classifies it into multiple levels: anger, disgust, fear, happiness, sadness, and neutrality.
The paper is organized into five sections. Section 2 describes previous research on driver drowsiness monitoring systems, and Section 3 presents the proposed system. The experimental results are analyzed in Section 4, and Section 5 concludes the paper.

Related Works
Numerous researchers are actively involved in determining the solution for road accidents due to driver drowsiness. The plentiful research results have been classified into five categories of driving like normal driving, fatigue driving, reckless driving, drunken driving, and distracted driving. Some of the notable research results were illustrated from which the proposed system with enhanced performance has been designed.
Figure 1: Statistical analysis of accidents due to driver fatigue and reckless driving (axes: year, number of fatal crashes; series: driver fatigue, reckless driving)
de Naurois et al. (2018) designed a model for predicting driver drowsiness using Artificial Neural Networks (ANN) [9]. The system analyzes the heartbeat rate, which is fed as input to the ANN to detect the drowsiness of the driver. The experimental analysis indicates that the system detects driver drowsiness with an accuracy of about 80%. Jabbar et al. (2018) designed a real-time driver drowsiness detection system as an Android mobile application using Deep Neural Network (DNN) techniques [10]. Their method integrates deep learning with the Android mobile application and achieved an accuracy of 80% in the experimental analysis.
de Naurois et al. (2017) proposed a driver drowsiness detection model based on Artificial Neural Networks (ANN) that uses eye blink duration and frequency as the major inputs [11]. The model identifies the drowsiness of the driver with an error of 0.22 and detects it rapidly, with a mean square of 4.18 minutes. Moujahid et al. (2021) proposed an efficient and compact face descriptor for detecting driver drowsiness that combines face expression detection and multilevel face representation, and evaluated it on the NTH Drowsy Driver Detection (NTHDDD) dataset [12]. The framework is shown to perform on par with a convolution neural network. [...] Network algorithm (DBN) and the algorithm yields a lower false-positive rate than the existing PERCLOS, which is the present standard for driver drowsiness detection systems [14]. Phanikrishna et al. (2021) designed an automatic classification model for detecting driver drowsiness using the wavelet packet transform [15]. The wavelet packet transform features are extracted from single-channel Electro-Encephalogram (EEG) signals of the driver. The model yields an accuracy of 94.45% in real-time sleep analysis. Taherisadr et al. (2018) designed a model for identifying the attention of the driver using a two-dimensional Mel-Frequency Cepstrum transform and a Convolution Neural Network (CNN) [16]. The model extracts the two-dimensional Mel-Frequency Cepstrum representation of the ElectroCardiogram (ECG) sensed from the driver. The analytical results indicate that the model is more efficient than the existing methodologies for drowsiness detection during driving. Lee et al. (2017) designed a system that performs correlation analysis of ElectroCardiography (ECG) and Photoplethysmogram (PPT) data for detecting the drowsiness of the driver [17].
This model is a noise replacement model, and the experimental analysis indicates that it is more efficient than the PPT method of detecting driver drowsiness. Kumar et al. (2020) focused on the implementation of surveillance systems using embedded systems and signal processing tools [18]. The system concentrates on three functions, namely detecting driver drowsiness, detecting alcohol consumption, and crash detection, to provide better vehicle control. The experimental results show that this method is more efficient than the existing analog system, with a high level of accuracy. Kowalczuk et al. (2019) proposed a different kind of driver monitoring system based on detecting the emotion of the driver [19]. The system identifies the internal and real emotions of the driver, and the final emotion is obtained using a Kalman filter, in which the emotion is treated as digital data. The system does not address the detection of driver fatigue. Li et al. (2020) proposed a system for analyzing facial expression and emotion correlations to detect urban crimes. Facial Expression Recognition (FER) was used to detect the emotion of the user from the facial expression, and the results were compared with Kernel Density Estimation (KDE) to reveal the relationship between the emotion and the driving pattern [20]. Another study compared different image classification algorithms based on traditional machine learning and deep learning. The study was carried out on both a very large dataset (MNIST) and a small dataset (COREL1000). The experimental results show that traditional machine learning performs better on small datasets, while deep learning achieves higher recognition accuracy on large datasets [21].
The proposed model is an integrated model, which detects the drowsiness of the driver and identifies the emotion of the driver to avoid reckless driving which is one of the vital causes of road accidents.

Proposed Model
The proposed model is composed of two modules: detection of the driver's fatigue, and emotion analysis of the driver to avoid reckless driving. The first module, which detects the drowsiness of the driver, consists of three phases: gathering data from the driver sensing module, preprocessing the acquired data, and deep learning based on a Convolution Neural Network (CNN). The three phases and their functions are depicted in Fig. 2.

Data Gathering Phase
The data gathering phase is the initial training phase of the proposed model, in which driver behavior is divided into multiple levels: normal, fatigue, aggressive, disturbed, and alcohol consumption. The data gathering phase not only collects information on the driver's multi-level behavior but also monitors the acceleration system of the vehicle based on the revolutions per minute (RPM), speed, and throttle of the vehicle. The acceleration analysis of the vehicle is performed on the linear acceleration. The acceleration and gravity are considered along the three axes (x, y, and z) to determine the linear acceleration of the vehicle. The measured linear acceleration determines the driving mode of the vehicle, which is classified under one of the levels such as normal, aggressive, drunken driving, or reckless driving. A Driver Drowsiness Detection (DDD) dataset is used to train the driver drowsiness detection phase, and the extended Cohn-Kanade dataset (CK+) is used to train the driver emotion analysis phase [22,23]. The values measured during the experimental training phase are used as reference values for live testing. The captured image is processed to determine the eye blinking factor and its frequency under multi-level conditions such as normal condition, fatigue, drunkenness, and aggression. The collected values are stored in the local database as trained values so that they can be compared with the test values during live measurements.
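The linear-acceleration step described above can be illustrated with a minimal Python sketch. It assumes the common convention that linear acceleration is the raw three-axis reading minus the gravity component, with values in m/s^2; the function names, the sample readings, and the 3.0 threshold are hypothetical and are not taken from the proposed system.

```python
import math


def linear_acceleration(accel, gravity):
    """Remove the gravity component from a raw (x, y, z) accelerometer reading."""
    return tuple(a - g for a, g in zip(accel, gravity))


def magnitude(vec):
    """Euclidean length of a three-axis vector."""
    return math.sqrt(sum(c * c for c in vec))


def classify_driving_mode(lin_acc, aggressive_threshold=3.0):
    """Toy two-level rule; the threshold is a placeholder, not a calibrated value."""
    return "aggressive" if magnitude(lin_acc) > aggressive_threshold else "normal"


raw_reading = (0.5, 0.2, 9.81)   # hypothetical sensor sample
gravity = (0.0, 0.0, 9.81)       # hypothetical gravity estimate
lin = linear_acceleration(raw_reading, gravity)
mode = classify_driving_mode(lin)
```

A real system would estimate the gravity vector continuously (for example with a low-pass filter) and would classify into all the driving levels rather than two.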

Pre-Processing Phase
The pre-processing phase is a necessary step performed before the data is applied to the convolution neural networks. Raw measured data applied directly to the convolution neural networks produces errors in the output, so the pre-processing phase is vital for converting the raw data into a format the networks accept. In the driver fatigue monitoring system, the pre-processing phase takes the input image in a time-domain representation and labels it with one of the levels normal, aggressive, drunken, and fatigue. The time interval is the most essential parameter of the proposed system and is selected to be 1.0 sec for detecting the distraction of the driver. This interval must be selected with care to avoid overlapping of input samples, which may lead to loss of data. At the same time, the interval must be as short as possible in order to detect even minute levels of driver distraction. The Recurrence Plot (RP) is employed to view the recurrent states; in the proposed model, the recurrence plot of the measured temporal data (time series) is used to produce digital images with spatial properties in the frequency domain. The mathematical expression for the Recurrence Plot is given in Eq. (1).
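A standard recurrence-plot expression consistent with the symbols defined for Eq. (1) is sketched here; the state vectors x_a and x_b, the choice of norm, and the window length N are assumed notation rather than the authors' own.

```latex
R_{a,b} = \alpha\!\left(R_T - \lVert x_a - x_b \rVert\right), \qquad a, b = 1, \dots, N \tag{1}
```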
Here R_{a,b} is the recurrence plot, R_T is the recurrence threshold, and α is the Heaviside function of the temporal data. The algorithm for the time domain windowing and generating the recurrence plot is illustrated in Tab. 1.
Tab. 1: Algorithm for time domain windowing and recurrence plot generation (fragment)
    ...
    end
    Image = concatenate(Image_x)
end
In the proposed system, the captured image of the driver is converted from the time domain, and the recurrence plotting is performed using the PyRQA toolbox, which performs recurrence quantification analysis and generates recurrence plots in a massively parallel manner. Each plotted image has dimensions of 50 x 50 pixels and is stored in grayscale format. The number of samples measured and the plotted images for each level are listed in Tab. 2.
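As a rough illustration of the windowing and recurrence plotting described above, the following Python sketch builds a binary recurrence image from a one-dimensional signal. It is not the PyRQA-based implementation used in the proposed system: it assumes a scalar time series, uses the absolute difference as the distance, takes the Heaviside function to be 1 at zero, and uses non-overlapping windows of 50 samples so that each window yields a 50 x 50 image. All function names and values are hypothetical.

```python
def sliding_windows(series, window, step):
    """Split a series into windows; step == window gives non-overlapping windows."""
    return [series[i:i + window]
            for i in range(0, len(series) - window + 1, step)]


def recurrence_plot(window, threshold):
    """Binary recurrence matrix: entry (a, b) is 1 when |x_a - x_b| <= threshold."""
    n = len(window)
    return [[1 if abs(window[a] - window[b]) <= threshold else 0
             for b in range(n)]
            for a in range(n)]


def to_grayscale(matrix):
    """Map the binary matrix to 8-bit grayscale pixel values (0 or 255)."""
    return [[255 * v for v in row] for row in matrix]


# Hypothetical signal: 100 samples, i.e. two windows of 50 samples each.
signal = [(i % 10) / 10.0 for i in range(100)]
windows = sliding_windows(signal, window=50, step=50)
images = [to_grayscale(recurrence_plot(w, threshold=0.15)) for w in windows]

# Tiny example that can be checked by hand.
small = recurrence_plot([0.0, 0.1, 1.0], threshold=0.2)
```

The diagonal of a recurrence matrix is always 1, because every sample is within any non-negative threshold of itself.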
The preprocessed 50 X 50 pixel input images are resampled and reconstructed into 150 X 100 pixel images, which are fed as input to the next stage, the Convolution Neural Network.
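The resampling step can be sketched as follows. The interpolation method is not stated in the text, so nearest-neighbour resampling is used here purely as an assumption, and 150 X 100 is read as 150 columns by 100 rows; the function name is hypothetical.

```python
def resize_nearest(image, new_height, new_width):
    """Nearest-neighbour resize of a 2D list of pixel values."""
    old_height, old_width = len(image), len(image[0])
    return [[image[r * old_height // new_height][c * old_width // new_width]
             for c in range(new_width)]
            for r in range(new_height)]


# A 50 x 50 placeholder image resampled to 100 rows x 150 columns.
source = [[(r + c) % 256 for c in range(50)] for r in range(50)]
resized = resize_nearest(source, new_height=100, new_width=150)

# Small case that can be checked by hand.
tiny = resize_nearest([[1, 2], [3, 4]], new_height=4, new_width=6)
```

In an OpenCV-based pipeline the same step would normally be done with the library's own resize routine and a chosen interpolation mode.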

Deep Learning Phase
The deep learning phase is the final and core processing function of the proposed model for detecting the fatigue of the driver. It consists of two steps: the preprocessed image is fed to the convolution neural network, and the output, which reflects the driver's behavior, is classified under one of the aforementioned four levels. The Convolution Neural Network (CNN) is preferred over a conventional Neural Network in the driver drowsiness monitoring system because conventional Neural Networks involve complex training procedures and require all the datasets to be trained, which is time consuming. A Deep Neural Network (DNN) accepts low-level representations at its first layer, and these are fused into high-level representations at its final layer. The Convolution Neural Network (CNN) is a type of Artificial Neural Network (ANN) that is employed in a wide variety of applications because of its simpler processing. The 150 X 100 output image of the preprocessing phase is applied to all the channels of the convolution neural network, and the majority of the CNN layers extract features from the input image. The final layer of the CNN performs the maximum classification of the processed image and assigns the observed image to one of the states normal, fatigue, drunken, and reckless. The architecture of the CNN in the proposed model is depicted in Fig. 3. The proposed CNN model possesses two levels of convolution filters of order (n, 2n). These two combined levels of convolution filters reduce the complexity of the model and also decrease the distraction detection time to 1.0 sec, which leads to rapid detection of driver drowsiness and alerting of the driver. The algorithm for training and testing the measured data using the Convolution Neural Network is given in Tab. 3.
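To make the (n, 2n) filter arrangement concrete, the sketch below does the shape and parameter bookkeeping for a two-level convolutional stack followed by a dense classification layer over the four driver states. Everything beyond the (n, 2n) ordering, the 150 X 100 input, and the four classes is an assumption made for illustration: 3 x 3 kernels, no padding, stride 1, 2 x 2 max pooling after each level, a single grayscale channel, and n = 32.

```python
def conv_output(height, width, kernel=3):
    """Spatial size after a stride-1 convolution without padding."""
    return height - kernel + 1, width - kernel + 1


def pool_output(height, width, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return height // pool, width // pool


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per output filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


def describe_two_level_cnn(height, width, n, classes, channels=1):
    """Return per-level (filters, height, width, params), flattened size, total params."""
    layers = []
    h, w, c = height, width, channels
    total = 0
    for filters in (n, 2 * n):
        params = conv_params(c, filters)
        h, w = pool_output(*conv_output(h, w))
        c = filters
        total += params
        layers.append((filters, h, w, params))
    flattened = h * w * c
    total += (flattened + 1) * classes  # dense classification layer
    return layers, flattened, total


# 100 rows x 150 columns, n = 32 (assumed), four driver states.
layers, flattened, total_params = describe_two_level_cnn(100, 150, n=32, classes=4)
```

Under these assumptions the two levels produce 49 x 74 and 23 x 36 feature maps; adding a third level would shrink the maps further while adding filters, which is one way to reason about the trade-off between depth and model size.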
The trained data are compared with the test data to classify the input under one of the four driver behavior levels. The following subsection describes the second module of the proposed system, which detects the emotion of the driver.

Emotion Detection System
The emotion detection system is based on a Convolution Neural Network with a different set of processes and layers. It accepts the preprocessed data from the previous module, whose dimension is 150 X 100 pixels, to determine the emotion of the driver. The input image is treated as the test image and is compared with the trained images to classify the emotion of the driver as normal, anger, disgust, fear, happiness, or sadness. The Convolution Neural Network for emotion detection accepts an input image of 50 X 50 pixels, so the preprocessed data of 150 X 100 pixels is normalized and converted into a digital image of 50 X 50 pixels. The reduction of the image dimension is followed by diagnosis of the driver's emotion by the simple onboard computer running the convolution neural network. The entropy of the CNN is defined in Eq. (2).
Layers 1 and 2 of the convolution network classify the 50 X 50 pixel image, and the final layer magnifies the image to assign the emotion to one of the aforementioned categories. During training, the following augmentation properties were applied.
Brightness range: 75% to 100%
Rotation interval: ±2 degrees
Shear range: ±2%
Zoom transformation interval: ±2%
The tested images are normalized to a fixed range using the relationship in Eq. (3).
n' = \frac{n - n_{\min}}{n_{\max} - n_{\min}}    (3)
The final magnified output from layer 3 of the convolution neural network has a dimension of 150 X 150 pixels. Based on the classified emotion of the driver, a prerecorded song is played using the controller to neutralize the mentality of the driver so as to avoid reckless driving.
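The augmentation ranges and the min-max normalization of Eq. (3) can be sketched in a few lines of Python. The sketch assumes that the ±2% shear and zoom ranges are fractions of ±0.02 and that augmentation parameters are drawn uniformly; the names AUGMENTATION_RANGES, sample_augmentation, and min_max_normalize are hypothetical, and a constant image is mapped to zeros to avoid division by zero.

```python
import random

# Ranges taken from the list above; the uniform sampling is an assumption.
AUGMENTATION_RANGES = {
    "brightness": (0.75, 1.00),
    "rotation_deg": (-2.0, 2.0),
    "shear": (-0.02, 0.02),
    "zoom": (-0.02, 0.02),
}


def sample_augmentation(rng):
    """Draw one set of augmentation parameters within the stated ranges."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in AUGMENTATION_RANGES.items()}


def min_max_normalize(pixels):
    """Eq. (3): n' = (n - n_min) / (n_max - n_min), mapping values into [0, 1]."""
    low, high = min(pixels), max(pixels)
    if high == low:
        return [0.0 for _ in pixels]
    return [(p - low) / (high - low) for p in pixels]


params = sample_augmentation(random.Random(0))
normalized = min_max_normalize([50, 100, 150])
```

For the pixel values 50, 100, and 150 the normalized values are 0.0, 0.5, and 1.0.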

Experimental Results
This section evaluates the proposed model, which comprises two modules, and analyzes the experimental results for accuracy. The first module classifies driver fatigue and the other aforementioned driver states using Convolution Neural Networks (CNN). The experimental analysis is based on the accuracy and the error rate in detecting the driver state. As an initial analysis, the accuracy of the Convolution Neural Network (CNN) is compared with that of conventional classifiers, namely the KNN classifier, the SVM classifier, and their derived classifiers. Fig. 4 illustrates the comparison of the CNN with the conventional classifiers in terms of accuracy percentage. The CNN scales to very large datasets, and through multiple convolution operations it classifies images efficiently. Fig. 4 shows that the multi-layer CNN is more accurate in predicting the state of the driver and successfully classifies the multi-level state of the driver. The accuracy of the classifier improves as the processing duration increases. However, a driver drowsiness detection system must determine the distraction of the driver in minimal time. With two levels, the model is able to extract features that identify the drowsy state of the driver. If three or more levels are chosen, the model overfits and the accuracy is reduced. Fig. 6 presents the qualitative analysis of the proposed model, in which the two-level convolutional neural network uses the trained datasets to compare features with the testing data. In Fig. 6, the training and testing accuracies match closely, and the proposed model yields an accuracy of 93% in detecting the driver state, which is classified as normal, fatigue, drunken, or reckless.
The error analysis of the proposed model is shown in Fig. 7, which shows that the training data has a lower error than the testing data and that the error decreases toward zero as the number of epochs increases. Fig. 8 presents the confusion matrix of the Convolution Neural Network for the six emotion levels of the driver. The predicted levels are well aligned with the test levels, which indicates that the designed model works efficiently and detects driver fatigue and emotions with a high level of accuracy.
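For readers unfamiliar with the confusion matrix used in Fig. 8, the sketch below shows how such a matrix and the overall accuracy are computed for the six emotion levels. The label sequences are made-up illustrative data, not results from this study, and the function names are hypothetical.

```python
EMOTIONS = ["normal", "anger", "disgust", "fear", "happiness", "sadness"]


def confusion_matrix(actual, predicted, labels):
    """Rows are actual labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Illustrative labels only.
actual = ["anger", "anger", "fear", "normal", "sadness"]
predicted = ["anger", "fear", "fear", "normal", "sadness"]
matrix = confusion_matrix(actual, predicted, EMOTIONS)
overall = accuracy(matrix)
```

In this toy example four of the five samples lie on the diagonal, so the overall accuracy is 0.8; the off-diagonal entry records one anger sample predicted as fear.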

Conclusion
The Convolution Neural Network has become a notable technology in machine learning and automation systems due to its salient properties and features. Driver drowsiness detection is a monitoring problem on which many researchers are working to reduce road accidents. Despite the design of more efficient models, road accidents continue to escalate rapidly because of the long distraction detection duration of these models. Hence, to reduce the distraction detection duration and to increase the accuracy, the proposed model comprises two-level convolution neural networks that classify the driver behavior and emotion in a reduced detection duration. The experimental results show that the proposed model is well aligned with the trained data, and that the error rates of the trained and test data decrease with a minimal marginal difference between them. The experimental analysis and comparisons yield an accuracy of 93% in detecting both the behavior and the emotion of the driver. This system is suitable for an Automatic Driver Emotion Detection System (ADEDS), so that road accidents and the loss of valuable life may be considerably reduced in the future.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.