Autism Spectrum Disorder Prediction by an Explainable Deep Learning Approach

Anupam Garg; Anshu Parashar; Dipto Barman; Sahil Jain; Divya Singhal; Mehedi Masud; Mohamed Abouhawwash

doi:10.32604/cmc.2022.022170

Computers, Materials & Continua DOI:10.32604/cmc.2022.022170
Article

Anupam Garg1, Anshu Parashar1, Dipto Barman2, Sahil Jain3, Divya Singhal3, Mehedi Masud4 and Mohamed Abouhawwash5,6,*

1Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
2School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
3University Institute of Biotechnology, Chandigarh University, Mohali, India
4Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, 21944, Saudi Arabia
5Department of Mathematics, Faculty of Science, Mansoura University, Mansoura, 35516, Egypt
6Department of Computational Mathematics, Science, and Engineering (CMSE), Michigan State University, East Lansing, MI, 48824, USA
*Corresponding Author: Mohamed Abouhawwash. Email: abouhaww@msu.edu
Received: 29 July 2021; Accepted: 09 September 2021

Abstract: Autism Spectrum Disorder (ASD) is a developmental disorder whose symptoms become noticeable in the early years of life, though it can be present in any age group. ASD is a mental disorder that affects communication, social and non-verbal behaviors. It cannot be cured completely, but its effects can be reduced if it is detected early. Early diagnosis is hampered by the variation and severity of ASD symptoms, and by their overlap with symptoms commonly seen in other mental disorders. With the emergence of deep learning approaches in various fields, medical experts can be assisted in the early diagnosis of ASD. It is very difficult for a practitioner to identify and concentrate on the major features leading to an accurate prediction of ASD, which gives rise to the need for an automated approach. Also, the presence of different symptoms of ASD traits amongst toddlers leads to the creation of a large feature dataset. In this study, we propose a hybrid approach comprising both deep learning and Explainable Artificial Intelligence (XAI) to find the most contributing features for the early and precise prediction of ASD. The proposed framework gives a more accurate prediction along with recommendations on the predicted results, which will be a vital clinical aid for better and earlier prediction of ASD traits amongst toddlers.

Keywords: Deep learning; explainable artificial intelligence; autism spectrum disorder; machine learning

1 Introduction

Autism Spectrum Disorder (ASD) is a behavioral disorder that affects an individual's capacity for social reciprocity throughout their lifetime. The symptoms of ASD are exhibited during childhood and persist through adolescence and adulthood [1]. An ASD patient is characterized by repetitive behaviors, and an analysis of such behaviors can help in the development of an early detection approach. The diversity of behaviors demonstrated by ASD patients depends significantly on two factors, i.e., age and ability. The behavioral signs commonly present in an ASD patient are deficient expressive gestures, non-responsiveness to sound, lack of proper eye contact, no sensation of pain, repetition of words and agitation when daily routines change [1]. Compared to the general population, siblings of individuals with autism are at a fifty times greater risk of having ASD [2]. Also, males are 4–5 times more likely to be affected by ASD than females. WHO reported that 1 in 160 children worldwide has ASD. Hong Kong, South Korea and the United States are the top 3 countries with the highest prevalence rates of ASD. In India, the prevalence rate is 1 in 500 and the incidence rate is 11,914 people every year. In recent years, ASD prevalence in India has increased from 15 to 64 per 10,000. According to the Autism Society of America, the autism incidence rate is rising by 10–17% each year in the USA. In 2020, 1 in 54 children in the USA was identified with ASD, which is a 10% rise according to the new CDC (Centers for Disease Control and Prevention) report. Overall, the statistics on ASD prevalence are alarming, considering that it has been regarded as one of the rarest diseases.

Though ASD cannot be fully cured, early detection of ASD symptoms can help in reducing the effects of this disease. Machine learning (ML) has been applied to the prediction and detection of various diseases with good accuracy, which offers a ray of hope for the early detection of ASD based on various physical and physiological parameters. Detection and analysis of ASD is quite challenging because other mental health problems share common symptoms, which results in false detections. However, if an ML-based model could explain, in terms of its features, why it predicts ASD, medical professionals could make better decisions during early diagnosis. This was the motivation for our work, as early detection of ASD will help in decreasing the effects of symptoms through proper and timely treatment, which shall ultimately improve the lives of patients and their family members.

The major contributions through this paper are:

• A deep learning model is proposed for the Autism Spectrum Disorder (ASD) prediction at an early stage through the screening test dataset.

• The Explainable Artificial Intelligence (XAI) is applied to identify the contribution of features towards accurate prediction.

• Comparative case studies have been conducted using all available features in the dataset and the most contributing features identified by XAI, evaluated through performance metrics, namely, accuracy, precision, recall and F1-score.

The paper is organized into 4 sections. The background and a brief discussion of the technologies used are presented in Section 2, which also specifies the proposed framework along with the characteristics of the dataset used for the experimentation. Experimental results and performance metrics are presented and discussed in Section 3. Finally, Section 4 presents the conclusion and future perspectives.

2 Materials and Methods

In this section, we discuss the background and the proposed deep learning (DL)/XAI-based framework for the classification of patients as ASD victims or non-ASD victims. The framework and the layered DL training model, illustrated in Figs. 1 and 3, are explained in Section 2.2. The experimental details, including the datasets used, the DL/XAI results and the major findings, are given in Section 3.

2.1 Background

Over the years, many studies have focused on early ASD detection using different ML algorithms. Raj et al. [1] explored various ML algorithms on three publicly available non-clinical datasets for ASD prognosis. On comparing the performance of these algorithms, they observed that the Convolutional Neural Network (CNN) performed best, with accuracies of 99.53%, 98.30% and 96.88% for the prediction of ASD in adults, children and adolescents respectively. Akter et al. [3] also applied various ML and ensemble algorithms to datasets available from ML repositories (Kaggle and UCI) for more precise detection of autism. In their experiments, logistic regression outperformed the other algorithms, achieving an accuracy of 95.19%. ASD sometimes leads to self-injurious behaviors, such as repetitive and periodic self-hitting and head banging, which can cause the affected child to be hospitalized. Cantin-Garside et al. [4] analyzed ML algorithms for detecting different self-injurious behavior (SIB) types for a smart SIB monitoring system. The k-nearest neighbor and support vector machine classifiers provided the highest accuracies of 99.1% and 94.6% for individuals and groups respectively. Liu et al. [5] explored face processing patterns in children and found them to be a potential feature for detecting ASD, with a classification accuracy of 88.51%. Hyde et al. [6] conducted a review to help researchers address this problem further using approaches that are analytically and statistically comprehensive.

ASD is detected clinically via screening tests, which are expensive and time-consuming. To reduce the time required, Omar et al. [7] presented a hybrid ML-based prediction framework for individuals of any age category. The proposed hybrid approach, consisting of Random Forest-CART (Classification and Regression Tree) and Random Forest-ID3 (Iterative Dichotomiser 3), was evaluated on the AQ-10 dataset and a clinical dataset of 250 cases, providing prediction accuracies of 92.26%, 93.78% and 97.10% for children, adolescents and adults respectively. To analyze the role of upper limb movement in the detection of ASD, Crippa et al. [8] applied the Support Vector Machine (SVM) to kinematic data collected from 15 pre-school children with ASD and 15 typically developing children, obtaining a classification accuracy of 96.7%. Sadouk et al. [9] and Rad et al. [10] proposed CNNs trained to analyze the SMM (stereotypical motor movements) present in ASD patients, which outperformed the traditional approaches. Nasser et al. [11] used an ANN model for ASD detection on data collected from an ASD screening app and reported high accuracy. Sadiq et al. [12] explored the linguistic patterns of 33 children with ASD during multiple conversations with a medical specialist. They experimented with LSTM (Long Short-Term Memory) networks and speaker diarization patterns, which provided a considerable improvement in the R² performance metric. However, there is a long way to go in designing a definite approach that achieves satisfactory and consistently reproducible results with greater transparency and comprehensibility.

Since the 90s, multi-class classification has been a challenge in the ML field. Due to the complexity of the screening tests and the various parameters and hidden factors involved in ASD diagnosis, accurate prediction of ASD has not been easy. Data-driven DL approaches have a great ability to explore the hidden characteristics in the available features of the dataset.

2.2 Explainable Deep Learning Framework for Prediction of Autism Spectrum Disorder

The framework for the explainable deep learning-based classification of ASD is illustrated in Fig. 1. Further, a 4-layer deep learning model has been built and trained using state-of-the-art ASD datasets, as illustrated in Fig. 3.

Figure 1: Framework for the prediction of ASD using deep learning along with the recommendation of important features besides their ranks

The details of the framework, including the dataset, model building and training, are discussed below.

2.2.1 Dataset and Feature Selection

In this work, an ASD screening dataset has been used for model building, training and empirical evaluation. Two separate datasets are considered from an open-access database of nearly 1758 children/toddlers recorded and provided by Fadi Thabtah [13–15]. The datasets consist of 21 features, of which we have selected 15 relevant features (14 input, 1 output). A detailed description of the dataset features used in this study is given in Fig. 2. This ASD screening dataset of toddlers consists of influential features that can be utilized for enhancing the prediction of ASD cases and for determining autistic traits [16–19]. Fadi Thabtah recorded 10 features (A1-A10) based on an individual's behavior, along with other characteristics, for efficiently distinguishing ASD cases from controls in behavioral science.

Figure 2: Feature description of ASD screening dataset [16–19]

We have considered this specific dataset and selected these features because the dataset has been widely used by researchers in ASD studies and because the most promising and mandatory features should be considered for classification [18,19]. Further, the explainable approach has been applied in order to select the most promising features contributing toward the prediction results.

2.2.2 Deep Learning Based Classification Model Building and Training

The ASD screening data of the toddlers (Fig. 2) is taken as the input for training and testing of the DL model and the Explainable AI module. The entire feature set is used, namely A1-A10, age, sex, born with jaundice, family members with ASD history and the output feature/class (ASD Traits Y/N). During experimentation, we have used all of the above-mentioned features as well as the most impactful features computed by XAI to give findings and recommendations as per the results obtained from the DL model. Further, the available dataset has been partitioned for training and testing purposes, and the details are given in the experimental results and discussion section. After extracting the relevant features, the feature set is given as input to the 4-layer Deep Learning Model illustrated in Fig. 3.

Here, 4 hidden layers are deployed to classify the patients as ASD victims and non-ASD victims. A multi-layer auto-encoder neural network has been applied, which encodes the input data and is trained through back propagation to reduce the discrepancy between the input and its reconstruction. Stacked auto-encoders are based on a layered learning network in which the output of one hidden layer is sent as input to the adjacent hidden layer, and this process is followed until the network is trained completely. An auto-encoder has two parts, namely, the encoder and the decoder. The encoder sits on the input side and the decoder on the output side. The encoder converts the input data to a hidden representation, whereas the decoder transforms the hidden representation back to the input data. Activation functions are the main decision-making units of a neural network and strongly affect its performance, which makes the selection of a suitable activation function critical. Over the years, numerous activation functions have been formulated with a range of properties that are essential for efficient learning. For this model, we have applied the basic Rectified Linear Unit (ReLU), Hyperbolic Tangent (Tanh) and Sigmoid activation functions in sequence; details of these activation functions are given below:

Figure 3: Deep Learning model for ASD traits classification

Basic Rectified Linear Unit (ReLU): The rectified linear unit, also known as the ramp function, is applied to the weighted sum of a node's inputs. It outputs the input directly if it is positive and outputs zero otherwise, which is analogous to half-wave rectification. The mathematical representation is given as:

$$f(x) = x^{+} = \max(0, x) \tag{1}$$

ReLU trains deeper networks more efficiently than sigmoid or tanh and has been reported to be up to six times faster. Because its forward and backward propagation steps are simple, the result is cheap to compute. ReLU is well suited to classification problems using multiple convolutional layers.

Tanh: One of the drawbacks of using only the sigmoid activation function is that the neural network can get stuck at the edge values; an alternative is the hyperbolic tangent function, known as tanh. It is also an “S”-shaped activation function and behaves like a stretched sigmoid curve whose output values range between −1 and 1. Negative inputs are therefore mapped to negative outputs, and inputs close to zero are mapped to outputs close to zero, which helps keep the network from stalling during training. The mathematical representation of the tanh function is:

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2}$$

Because tanh is zero-centered and has a steeper derivative than the sigmoid, it can help gradient-based training reach a local or global minimum of the cost function faster.

Sigmoid Activation Function: It is an “S”-shaped, non-linear activation function, named after the sigmoid curve it follows. The equation of the sigmoid function for the deep learning model is as below:

$$Y = \sigma(x) = \frac{1}{1 + e^{-x}} \tag{3}$$

The above equation produces a considerable change in Y for a small change in x in the range of −2 to +2, which pushes the Y values towards one of the two ends of the curve and helps the classifier make clearer predictions. It is among the most extensively used activation functions in deep learning research; however, it is not a perfect function and has certain drawbacks.
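As an illustration of Eqs. (1)–(3), the following minimal Python sketch implements the three activation functions and passes a toy input through three small layers that use ReLU, tanh and sigmoid in sequence. The helper `dense`, the layer sizes and all weights are hypothetical values chosen for illustration only; they are not the trained parameters or the exact architecture of the proposed model.

```python
import math


def relu(x: float) -> float:
    """Eq. (1): return x for positive inputs, zero otherwise."""
    return max(0.0, x)


def tanh(x: float) -> float:
    """Eq. (2): written out from the exponential definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def sigmoid(x: float) -> float:
    """Eq. (3): squash any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """Hypothetical helper: one fully connected layer, activation(w . x + b)."""
    return [
        activation(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Toy input and made-up weights (assumptions, not values from the paper).
x = [1.0, 0.0, 1.0]
h1 = dense(x, [[0.5, -0.2, 0.1], [-0.3, 0.8, -0.4]], [0.0, 0.1], relu)
h2 = dense(h1, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], tanh)
y = dense(h2, [[2.0, -1.0]], [0.0], sigmoid)[0]

print("ReLU layer output:", h1)
print("tanh layer output:", h2)
print("sigmoid output (score in (0, 1)):", y)
```

The second ReLU unit receives a negative weighted sum and is clamped to zero, while the final sigmoid maps the last weighted sum to a score between 0 and 1 that could be thresholded to obtain a class label.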

2.2.3 XAI for Measuring ASD Features Influence

To transform the black-box deep learning model, which is based on a layered architecture, into a white-box model whose decisions can be explained, a number of methods are available, such as SHAP [18], LIME [19] and CIU. The model is trained on the training dataset and provides predictions for the testing data. Here, we have applied XAI to measure the influence of ASD features on the classification accuracy of the model. The XAI architectures use both the training dataset and the model to explain the predictions made for the testing dataset. Further, the most influential features have been selected and subsequently used for model training and testing, as discussed in the next section.

3 Experimental Results and Discussion

Here, we have conducted 4 case studies based on different datasets and selected features. Tab. 1 presents detailed information about the experimental datasets as well as the experimental setup used for each empirical case study.

During the experimentation, the following performance parameters were computed to evaluate the proposed model.

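Accuracy: The ratio of correctly classified samples to the total number of samples, where TP, TN, FP and FN denote true positives, true negatives, false positives and false negatives respectively.

$$\text{Accuracy} = \frac{TN + TP}{FP + TN + TP + FN} \tag{4}$$

Recall (Sensitivity): The proportion of patients who actually have ASD that are correctly identified as having ASD.

$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$

Precision: The proportion of patients identified as having ASD who actually have ASD.

$$\text{Precision} = \frac{TP}{FP + TP} \tag{6}$$

F-measure (F-score/F1-score): The harmonic mean of recall (sensitivity) and precision, expressing the overall success.

$$\text{F1-Score} = \frac{2}{\frac{1}{\text{Recall}} + \frac{1}{\text{Precision}}} \tag{7}$$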
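Eqs. (4)–(7) can be computed directly from the four confusion-matrix counts. The sketch below is a minimal illustration; the function name `classification_metrics` and the counts used in the example are hypothetical and are not taken from the experiments reported here.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute Eqs. (4)-(7) from confusion-matrix counts."""
    accuracy = (tn + tp) / (fp + tn + tp + fn)          # Eq. (4)
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # Eq. (5)
    precision = tp / (fp + tp) if (fp + tp) else 0.0    # Eq. (6)
    if recall > 0.0 and precision > 0.0:
        f1 = 2.0 / (1.0 / recall + 1.0 / precision)     # Eq. (7)
    else:
        f1 = 0.0  # harmonic mean is taken as zero if either term is zero
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for a 200-sample test set (illustration only).
metrics = classification_metrics(tp=97, fp=1, tn=99, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Harmonic mean of a precision of 0.99 and a recall of 0.97.
f1_from_rates = 2.0 / (1.0 / 0.97 + 1.0 / 0.99)
print(f"F1 from precision 0.99 and recall 0.97: {f1_from_rates:.4f}")
```

Because the F1-score is a harmonic mean, it always lies between recall and precision and is pulled towards the smaller of the two.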

Tab. 2 depicts the results obtained for three case studies. Here, accuracy, precision, recall and F-measure are computed to appraise the performance of the proposed approach.

The proposed model predicts the ASD class with accuracies of 0.79, 0.98 and 0.86 for case study-I, case study-II and case study-III respectively. Here, the model has achieved its maximum accuracy of 0.98 for dataset 2. Further, the model achieved values of 0.99, 0.97 and 0.98 for precision, recall and F-measure for dataset 2, higher than for the others.

Figs. 4a–4c illustrate the ROC curves, with values of 0.93, 0.81 and 1 obtained for case study-I, case study-II and case study-III respectively.

Figure 4: Graphical representation of the ROC performance metrics (a) For the case study I (b) For the case study II (c) For the case study III
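Assuming that the ROC values reported above are areas under the ROC curve (the text does not state this explicitly), such a value can be computed as the probability that a randomly chosen ASD case receives a higher score than a randomly chosen non-ASD case. The sketch below uses this pairwise formulation; the function name `roc_auc`, the labels and the scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    sample is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = ASD traits, 0 = no ASD traits) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC: {auc:.3f}")
```

A value of 1 means every ASD case is scored above every non-ASD case, while 0.5 corresponds to random ranking.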

Further, SHAP (SHapley Additive exPlanations) has been applied to obtain the most influential features. SHAP is a unified framework for interpreting predictions by ranking the features that contribute most to a decision. SHAP identifies a class of additive feature importance measures and shows that this class has a unique solution with a set of desirable properties. SHAP explains the output of any ML model using Shapley values, which come from cooperative game theory and distribute the prediction fairly among the features. SHAP provides both local and global explanations for a model. The explanation model g(x′), with simplified input x′, for an original model f(x) with input variables x=(x1,x2,…,xn) is given as Eq. (8):

$$f(x) = g(x') = \varphi_0 + \sum_{i=1}^{n} \varphi_i x'_i \tag{8}$$

where n is the number of input features, $\varphi_i$ is the contribution of feature i, and $\varphi_0$ is the constant value when all input values are missing.
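To make Eq. (8) concrete, the following sketch computes exact Shapley values by enumerating all feature subsets for a small toy model and checks that the contributions add up to the model output. It is an illustration under stated assumptions, not the procedure used by the SHAP library (which relies on approximations for realistic models): a "missing" feature is simulated by replacing it with a baseline value, and `toy_model`, `x` and `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values by subset enumeration (feasible for small n only).

    Features outside the coalition are set to their baseline value,
    which is one common way (an assumption here) to model "missing".
    Returns (phi_0, [phi_1, ..., phi_n]).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return value(set()), phi


def toy_model(z):
    """Hypothetical scoring function with one interaction term."""
    return 0.2 + 0.5 * z[0] + 0.3 * z[1] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi0, phi = shapley_values(toy_model, x, baseline)

print("phi_0:", phi0)
print("phi_i:", phi)
# With all features present (x'_i = 1), Eq. (8) reduces to phi_0 + sum(phi_i).
print("phi_0 + sum(phi_i):", phi0 + sum(phi), "model output:", toy_model(x))
```

In this toy case the interaction term 0.3·z₁·z₂ is split equally between the two interacting features, which reflects the fairness property of Shapley values.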

Figure 5: SHAP values showing the impact of all the features based on XAI

Figure 6: Features contribution to the accuracy of the model based on XAI

Figs. 5 and 6 indicate the contribution of the different features towards the classification accuracy. From these results, it has been observed that the features F3, F4, F5, F6, F7, F9 and F10 influence the prediction more than the remaining features. The XAI results clearly indicate and recommend the features or symptoms (presented in Tab. 3) which are highly correlated with the prediction of ASD traits. Further, during case study-IV, we have used the shortlisted features (Tab. 3) and applied the proposed model. Tab. 4 indicates that these features contribute nearly 79% of the overall prediction accuracy. Tab. 5 also depicts the comparison of the proposed framework with some state-of-the-art existing works. Other studies [20,21] have also experimented with machine learning on brain image and video ASD datasets; in contrast, we have worked on a patient symptom dataset used to diagnose ASD traits.
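One common way to turn per-sample SHAP values into a global feature ranking is to average their absolute values per feature and keep the top-ranked features. The sketch below assumes this approach; the paper does not state how its ranking was derived, and the feature names and SHAP values shown are made up for illustration and are unrelated to the reported results.

```python
def rank_features(shap_rows, names):
    """Rank features by mean absolute SHAP value, largest first."""
    n_rows = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


def select_top_k(ranking, k):
    """Return the names of the k highest-ranked features."""
    return [name for name, _ in ranking[:k]]


# Hypothetical feature names and per-sample SHAP values (one row per sample).
names = ["feat_a", "feat_b", "feat_c", "feat_d"]
shap_rows = [
    [0.05, -0.30, 0.10, 0.01],
    [-0.02, 0.25, -0.20, 0.00],
    [0.03, -0.35, 0.15, -0.02],
]

ranking = rank_features(shap_rows, names)
selected = select_top_k(ranking, k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
print("selected:", selected)
```

The selected feature names can then be used to subset the dataset before retraining, which mirrors the idea behind using only the shortlisted features in a separate case study.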

4 Conclusion

The proposed framework is an amalgamation of two recently used approaches, i.e., deep learning and XAI, for the early prediction of ASD. The focus of this paper is the accurate prediction of ASD amongst toddlers along with the recommendation of the features contributing the most to accuracy. The 4-layer deep learning model has been implemented to predict ASD on the screening test dataset for toddlers. This model has been evaluated for three case studies based on performance metrics, namely, accuracy, precision, recall and F1-score. Among the three case studies, two are based on the individual datasets and the third is based on combining the two datasets. It has been observed that for case study-II, the highest accuracy achieved is 98%, with a precision of 0.99, a recall of 0.97 and an F1-score of 0.98. XAI is applied using the SHAP model to rank the important features that contribute to accurate prediction, so that medical experts can focus on them. 79% of the total accuracy is achieved by the first 7 ranked features. For validation purposes, our methodology can be evaluated on different clinical datasets.

Acknowledgement: The authors are grateful to all who contributed to this article, including the editors and anonymous reviewers, and for the support of Taif University Researchers Supporting Project Number (TURSP-2020/10), Taif University, Taif, Saudi Arabia.

Funding Statement: The authors would like to acknowledge the support of Taif University Researchers Supporting Project Number (TURSP-2020/10), Taif University, Taif, Saudi Arabia.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.

References

1. S. Raj and S. Masood, “Analysis and detection of autism spectrum disorder using machine learning techniques,” Procedia Computer Science, vol. 167, pp. 994–1004, 2020.

2. U. Frith and F. Happé, “Autism spectrum disorder,” Current Biology, vol. 15, no. 19, pp. R786–R790, 2005.

3. T. Akter, M. I. Khan, M. H. Ali, M. S. Satu, M. J. Uddin et al., “Improved machine learning based classification model for early autism detection,” in 2021 2nd Int. Conf. on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, pp. 742–747, 2021.

4. K. D. Cantin-Garside, Z. Kong, S. W. White, L. Antezana, S. Kim et al., “Detecting and classifying self-injurious behavior in autism spectrum disorder using machine learning techniques,” Journal of Autism and Developmental Disorders, vol. 50, no. 11, pp. 4039–4052, 2020.

5. W. Liu, M. Li and L. Yi, “Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework,” Autism Research, vol. 9, no. 8, pp. 888–898, 2016.

6. K. K. Hyde, M. N. Novack, N. LaHaye, C. Parlett-Pelleriti, R. Anden et al., “Applications of supervised machine learning in autism spectrum disorder research: A review,” Review Journal of Autism and Developmental Disorders, vol. 6, no. 2, pp. 128–146, 2019.

7. K. S. Omar, P. Mondal, N. S. Khan, M. R. K. Rizvi and M. N. Islam, “A machine learning approach to predict autism spectrum disorder,” in 2019 Int. Conf. on Electrical, Computer and Communication Engineering (ECCE), Cox's Bazar, Bangladesh, pp. 1–6, 2019.

8. A. Crippa, C. Salvatore, P. Perego, S. Forti, M. Nobile et al., “Use of machine learning to identify children with autism and their motor abnormalities,” Journal of Autism and Developmental Disorders, vol. 45, no. 7, pp. 2146–2156, 2015.

9. L. Sadouk, T. Gadi and E. H. Essoufi, “A novel deep learning approach for recognizing stereotypical motor movements within and across subjects on the autism spectrum disorder,” Computational Intelligence and Neuroscience, eCollection, vol. 2018, no. 3, pp. 1–16, Article ID 7186762, 2018. https://doi.org/10.1155/2018/7186762.

10. N. M. Rad and C. Furlanello, “Applying deep learning to stereotypical motor movement detection in autism spectrum disorders,” in 2016 IEEE 16th Int. Conf. on Data Mining Workshops (ICDMW), Barcelona, Spain, pp. 1235–1242, 2016.

11. I. M. Nasser, M. O. Al-Shawwa and S. S. Abu-Naser, “Artificial neural network for diagnose autism spectrum disorder,” International Journal of Academic Information Systems Research, vol. 3, no. 2, pp. 27–32, 2019.

12. S. Sadiq, M. Castellanos, J. Moffitt, M. Shyu, L. Perry et al., “Deep learning based multimedia data mining for autism spectrum disorder (ASD) diagnosis,” in 2019 IEEE Int. Conf. on Data Mining Workshops (ICDMW), Beijing, China, pp. 847–854, 2019.

13. F. Thabtah, “Autism spectrum disorder screening: machine learning adaptation and DSM-5 fulfillment,” in ICMHI '17: Proc. of the 1st Int. Conf. on Medical and Health Informatics 2017, Taichung, Taiwan, pp. 1–6, 2017.

14. F. Thabtah, “Machine learning in autistic spectrum disorder behavioral research: A review and ways forward,” Informatics for Health and Social Care, vol. 44, no. 3, pp. 278–297, 2019.

15. F. Thabtah, F. Kamalov and K. Rajab, “A new computational intelligence approach to detect autistic features for autism screening,” International Journal of Medical Informatics, vol. 117, pp. 112–124, 2018.

16. K. S. Ramana, M. S. Lakshmi and M. Janardhan, “Machine learning based novel autism spectrum disorder screening,” Turkish Journal of Computer and Mathematics Education, vol. 12, no. 3, pp. 4866–4879, 2021.

17. H. S. Alarifi and G. S. Young, “Using multiple machine learning algorithms to predict autism in children,” in Proc. of Int. Conf. Artificial Intelligence, New York, NY, United States, pp. 464–467, 2018.

18. S. M. Lundberg and S. Lee, “A unified approach to interpreting model predictions,” in NIPS'17: Proc. of the 31st Int. Conf. on Neural Information Processing Systems, New York, United States, pp. 4768–4777, 2017.

19. M. T. Ribeiro, S. Singh and C. Guestrin, “‘Why should I trust you?’ Explaining the predictions of any classifier,” in KDD '16: Proc. of the 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, New York, United States, pp. 1135–1144, 2016.

20. H. S. Nogay and H. Adeli, “Machine learning (ML) for the diagnosis of autism spectrum disorder (ASD) using brain imaging,” Reviews in the Neurosciences, vol. 31, no. 8, pp. 825–841, 2020.

21. E. Leblanc, P. Washington, M. Varma, K. Dunlap, Y. Penev et al., “Feature replacement methods enable reliable home video analysis for machine learning detection of autism,” Scientific Reports, vol. 10, no. 21245, pp. 1–11, 2020.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.