Difficulties in communicating and interacting with other people are characteristic of the neurological condition called autism spectrum disorder (ASD). ASD can affect a person at any stage of life: childhood, adolescence, or adulthood. It is regarded as a behavioral disorder because its symptoms appear within the first two years of life and continue into adulthood. Most studies indicate that early detection of ASD helps improve the behavioral characteristics of patients. Detecting ASD remains a challenging task for researchers. Machine learning (ML) algorithms can learn from complex data and produce high-quality predictions. In this paper, ensemble ML techniques for the early detection of ASD are proposed. The dataset is first processed using three ML algorithms: sequential minimal optimization with support vector machine, Kohonen self-organizing neural network, and random forest. The predictions of these algorithms are then combined using the bagging concept called max voting to produce the final result. The accuracy, sensitivity, and specificity of the proposed system are calculated using a confusion matrix. The proposed ensemble technique performs better than state-of-the-art ML algorithms.
Autism spectrum disorder (ASD) is a condition in which people have difficulties with interaction and communication. These difficulties stem from adverse effects on the nervous system, which in turn affect the eyes, emotional hormones, and health of patients with autism. Symptoms and severity of ASD vary from one person to another. The most commonly identified symptoms are difficulties in social communication and interaction, and obsessive, repetitive behaviors. As of April 2020, 1 in 54 children was reported to be affected by ASD. The disorder begins in childhood and continues through adolescence into adulthood. Some patients with ASD live quite independently, whereas others need lifelong special care. The symptoms experienced by a person with ASD can be reduced by evidence-based psychosocial treatment and parent skill training programs. Before any treatment, however, comes early detection. This paper focuses on various machine learning (ML) algorithms to detect the early symptoms of ASD.
ASD is usually found in childhood at around 2 to 3 years [
- No concentration on surrounding events
- Repetition of the same words, names, and situations again and again
- Unusual interaction with other people
- No gestures or facial expressions during communication
- High sensitivity to touch and speech
- A loud voice that sounds rude
- Abnormal body postures shown to others
The factors associated with ASD in this work include parental genes, an autistic family member, complicated deliveries, and missed vaccinations in children. This paper analyzes ensemble ML techniques for predicting ASD with high accuracy at an early stage. The literature review presents the detection techniques and ML algorithms that various researchers have used in this area.
Abnormal development of the human brain leads to problems such as ASD. A person with ASD has extreme difficulty facing people and interacting with them socially, and the person's entire life is affected. To date, various researchers have identified genetic and environmental factors that cause ASD. If the syndrome is detected early in life, its effects can be reduced, but it cannot be fully cured. The major risk factors analyzed are low birth weight, a sibling with ASD (if the first child has ASD, the second child may also have it), advanced parental age, and late marriage. Patients with ASD have difficulties such as the following:
- Giggling laughter and loud crying
- Not sensing pain in the body
- Complete lack of eye contact
- Not expressing wishes for anything
- Always preferring to stay alone
- Attachment to inappropriate objects
Patients with ASD will not be interested in any constraints. They repeatedly show consistent conduct and behave as follows:
- They tend to repeat the same words most of the time.
- They become upset when their routine changes, for example when moving home or leaving friends.
- They may remember small facts and numbers.
- They are less sensitive to noise, light, and pain.
These symptoms cannot be cured, but they can be reduced through early-stage detection. Early detection and treatment of ASD improve the quality of life of patients, but no medical test for detecting autism early is available. A judgment can only be made based on behavioral symptoms. ASD in adolescents is usually first recognized by their teachers and parents at school. The school health service examines the symptoms and gives suggestions, and the student is then referred to a doctor for an ASD examination. This process is very difficult because ASD resembles other mental health issues. This problem motivates us to use artificial intelligence (AI) technologies to detect ASD earlier.
Recent advances in AI have led researchers to use ML extensively in disease prediction. ASD is a rapidly increasing disorder that calls for more technological support for early prediction. In this technology, a model is trained on observed patterns so that an ML algorithm can act on those patterns and return results [
Section 2 reviews the related literature. Section 3 explains different ML algorithms combined to perform an accurate early prediction of diseases. Section 4 evaluates the performance of the proposed result with existing techniques. Section 5 discusses the conclusion and future work.
This section reviews previous research on ASD prediction. Among all ML algorithms, the focus is on those used to predict diseases with high accuracy.
The support vector machine (SVM) classifier is used in article [
Linear discriminant analyses and k-nearest neighbor (KNN) algorithm perform well in classifying the data with 90% accuracy. Instrument for ASD detection is used by Kosmicki et al. [
The mobile application-based image processing technique proposed in Bone et al. [
ASD is a neurological or behavioral disorder that entails lifelong interaction and communication problems. It may start at toddler age or in childhood and continue through adolescence into adulthood. The disability is not curable, but it can be diagnosed at an early stage, and early detection makes treatment more effective. Based on symptoms, ASD can be identified as early as 2 years of age. Various ML algorithms have been used to diagnose ASD, but finding the best methods for its early prediction is still an open research problem.
Ensemble learning combines the results of several ML models to provide a better prediction than any single ML algorithm. Ensemble models can be divided into bagging and boosting.
In this proposed work, the early diagnosis of ASD is predicted using four ASD datasets: Toddlers, Children, Adolescents, and Adults. The input dataset is divided into training and testing data. The proposed prediction is divided into four phases: preprocessing, feature selection (FS), classification using ML algorithms, and ensemble learning for the final prediction. The raw input dataset is preprocessed to remove missing values, and the features are selected using the bootstrapped gradient boosting FS algorithm. The dataset with the selected features is then classified using three ML algorithms: SMO–SVM, Kohonen self-organizing neural network (SONN), and RF. The prediction results of these ML algorithms are then combined using the bagging concept called max voting to predict the final result. The proposed methodology is depicted in
The datasets are collected from the UCI repository [
S. No. | Dataset name | Attribute type | No. of attributes | No. of instances |
---|---|---|---|---|
1 | Toddlers | Categorical, continuous, binary | 21 | 1054 |
2 | Children | Categorical, continuous, binary | 21 | 292 |
3 | Adolescents | Categorical, continuous, binary | 21 | 104 |
4 | Adults | Categorical, continuous, binary | 21 | 704 |
Attribute ID | Attribute name | Type | Description |
---|---|---|---|
1 | Age | Number | Toddlers (month), children, adolescents, and adults (year) |
2 | Gender | String | Male or female |
3 | Ethnicity | String | List of common ethnicities in text format |
4 | Born with jaundice | Boolean | Whether the case was born with jaundice |
5 | Family member with PDD | Boolean | Whether any immediate family member has PDD |
6 | Who is completing the test | String | Parent, self, caregiver, medical staff, and clinician |
7 | Country of residence | String | List of countries in text format |
8 | Used the screening application before | Boolean | Whether the user has used a screening application |
9 | Screening Method Type | Integer | Type of screening methods selected based on age category |
10 | Response of Q1 | Binary | Does your child look at you when you call his/her name? |
11 | Response of Q2 | Binary | How easy is it for you to get eye contact with your child? |
12 | Response of Q3 | Binary | Does your child point to indicate that s/he wants something? |
13 | Response of Q4 | Binary | Does your child point to share interest with you? |
Data preprocessing is important for transforming the raw data into a meaningful, understandable format. The collected dataset is preprocessed to remove features that are not needed for further processing and records with missing values. The removed records are not used in later stages such as FS and classification. This removal of unwanted features and records with missing values improves classification accuracy [
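The removal step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout, the column names, and the use of "?" as a missing-value marker are all assumptions made for the example.

```python
# Assumption: missing values are encoded as "", "?" or None in the raw records.
MISSING = {"", "?", None}

def drop_incomplete(records, drop_columns=()):
    """Remove unwanted columns, then drop records that still have missing values."""
    cleaned = []
    for rec in records:
        kept = {k: v for k, v in rec.items() if k not in drop_columns}
        if all(v not in MISSING for v in kept.values()):
            cleaned.append(kept)
    return cleaned

# Hypothetical raw records (column names are illustrative only).
raw = [
    {"age": "30", "ethnicity": "?", "jaundice": "no", "A1": "1"},
    {"age": "4", "ethnicity": "Asian", "jaundice": "yes", "A1": "0"},
    {"age": "?", "ethnicity": "Latino", "jaundice": "no", "A1": "1"},
]
clean = drop_incomplete(raw, drop_columns=("ethnicity",))
print(len(clean), "records kept")
```

Dropping the unwanted column first means a record is discarded only when a value that is actually used later is missing.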
Various FS algorithms, such as correlation, gain ratio, and information gain, are available for selecting the relevant features from the whole feature set. Our proposed work focuses on the use of ensemble learning in ASD prediction and uses an ensemble bagging concept called bootstrap gradient boosting to select the relevant features for classification. The main motivation for this technique is that gradient boosting performs well because it belongs to the ensemble class of ML methods. In gradient boosting, the ensemble is built from decision trees by adding one tree at a time. It generally works like AdaBoost, but bootstrap aggregation is used to improve performance. Here, samples are randomly selected, and ensemble members are fitted to them. The bootstrap samples are drawn independently of one another, so they end up distributed with a low correlation between the input data samples. Gradient boosting uses a gradient descent algorithm to minimize the loss while adding models to the ensemble.
Step 1: The subsets of features are selected randomly from the preprocessed data.
Repetitions of features are eliminated, and identical subsets of features are generated.
Step 2: For i=1 to X (all training data samples)
Step 3: Gradient boosting improves each iteration t by creating a new subset that adds an estimator called h to create a better subset model. h is represented as
Step 4: Weights are calculated after each iteration using
where d is the decreasing constant.
Step 5: Weight is updated as
Step 6: End for
Step 7: Features are ranked in descending order according to weight.
The features at the top have the highest weight and are selected as the relevant features. In the dataset under consideration, six attributes (who completed the screening, age, gender, used the application before, country of residence, and screening score) are not needed because they contribute nothing to our analysis. These six attributes are removed, and the 14 remaining attributes are selected by the proposed FS approach as the relevant features for further processing, which improves prediction accuracy.
The best-performing ML algorithms are combined for ensemble learning. Three ML approaches, namely SMO–SVM, Kohonen SONN, and RF, are used to classify the features with high accuracy. These three algorithms perform well individually, and combining them provides a further improvement in ASD classification. The features are classified by each member of the ensemble, and the results are combined using the bagging concept called max voting to predict the final result.
SVM is among the most effective classification algorithms for many prediction problems. In this proposed work, sequential minimal optimization (SMO) is used to train the SVM to improve the accuracy of the classification result. The linear classification of SVM is represented in
Our problem is a binary classification, where the output is predicted as y=1 if f(x)>=0 and y=−1 if f(x)<0. The linear function is also improved with the kernel represented as
The kernel function
The optimal solution is based on the condition as represented in
The classification of the SMO-trained SVM is shown in
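For reference, the standard textbook forms of the quantities discussed in this subsection are sketched below. They are consistent with the decision rule stated above (y=1 if f(x)>=0, y=−1 otherwise), but the symbols w, b, α_i, K, N, and C are the conventional ones and are assumed here; they may differ from the notation of the original equations.

```latex
% Linear decision function (w: weight vector, b: bias)
f(x) = w^{\top} x + b, \qquad
y = \begin{cases} +1, & f(x) \ge 0 \\ -1, & f(x) < 0 \end{cases}

% Kernelized form over N training pairs (x_i, y_i) with multipliers \alpha_i
f(x) = \sum_{i=1}^{N} \alpha_i \, y_i \, K(x_i, x) + b

% KKT optimality conditions checked by SMO (C: regularization bound)
\alpha_i = 0 \;\Rightarrow\; y_i f(x_i) \ge 1, \qquad
0 < \alpha_i < C \;\Rightarrow\; y_i f(x_i) = 1, \qquad
\alpha_i = C \;\Rightarrow\; y_i f(x_i) \le 1
```

SMO repeatedly picks a pair of multipliers that violate these conditions and optimizes them analytically while keeping the others fixed.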
The Kohonen network, also called SONN, is a computational method for analyzing and classifying high-dimensional data. The main objective of the Kohonen SONN is to map input data of arbitrary dimension onto a discrete map composed of neurons. This map is trained to organize the data. During training, the locations of the neurons on the map do not change; only their weights are updated. In the first phase of self-organization, each neuron is initialized with small weights. In the second phase, the neuron closest to an input point is considered the winning neuron, and the neurons near the winning neuron also move toward the point. Euclidean distance is used to measure the distance between a neuron and the point, and the neuron with the smallest distance is the winning neuron. This process is repeated for all iterations, and the points are clustered. In this work, the Kohonen network is used to determine with high accuracy whether patients have ASD. The data item with n dimensional Euclidean vectors is represented in
where t is the index of the data item in the sequence. The ith model is declared as
The new value of
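The training procedure described above can be sketched as a small one-dimensional Kohonen map. This is an illustrative sketch under stated assumptions, not the authors' implementation: the linear decay of the learning rate and radius, the Gaussian neighbourhood, and all parameter names (`n_neurons`, `epochs`, `lr0`, `radius0`) are choices made for the example.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x (Euclidean distance)."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def train_som(data, n_neurons=4, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D Kohonen map and return the list of neuron weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Phase 1: every neuron starts with small random weights.
    weights = [[rng.uniform(0.0, 0.1) for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # assumed linear decay schedule
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        for x in data:
            # Phase 2: find the winning neuron, then pull it and its grid neighbours toward x.
            win = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood over the fixed grid positions j and win.
                h = math.exp(-((j - win) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Toy data with two groups of points (illustrative only).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
som = train_som(data)
print([best_matching_unit(som, x) for x in data])
```

The grid index of each neuron never changes; only the weight vectors move, which matches the description that neuron locations stay fixed during training.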
RF is a classifier based on decision trees: it builds many trees from the data samples, and each tree provides its own classification result. Each RF is a collection of decision trees in the form of
The probability of the prediction of each subset is represented as
Step 1: Preprocess: The input dataset is preprocessed using Section 3.2.
Step 2: FS: The preprocessed data are then given as input to the FS method called bootstrapped gradient boosting presented in Section 3.3 using Algorithm 1.
Step 3: ML: The data with selected features are then given as input to the classification algorithms SMO–SVM, Kohonen SONN, and RF.
Step 4: Ensemble Learning: The prediction results of each algorithm are calculated separately. The result with the maximum number of votes is the final result of the proposed ASD prediction.
Max(output (SMO–SVM), output(Kohonen SONN), output (RF))
Step 5: Output: Return the prediction result.
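Step 4 of the algorithm above can be sketched as follows. The sketch reads "max voting" as majority voting over the class labels predicted by the three classifiers, which is an interpretation rather than something the text states explicitly; the function name `max_vote` and the example predictions are hypothetical.

```python
from collections import Counter

def max_vote(*model_predictions):
    """Combine per-model label lists by majority (max) voting, sample by sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]

# Hypothetical predictions for four samples (1 = ASD, 0 = no ASD).
svm_pred = [1, 0, 1, 1]
sonn_pred = [1, 1, 0, 1]
rf_pred = [0, 0, 1, 1]

final = max_vote(svm_pred, sonn_pred, rf_pred)
print(final)
```

With three voters and two classes a tie cannot occur, so the vote is always decided.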
The proposed ensemble-based ASD prediction system predicts ASD in patients through a series of processes: preprocessing, FS, and classification. Preprocessing removes the missing values and redundant records from the input data. FS determines the relevant features, and classification with the best classifiers then categorizes the data accurately. Hence, the proposed ASD prediction system covers all the steps of a complete prediction system, and each step contributes to a higher prediction accuracy than that of existing approaches.
The proposed ensemble-based ASD prediction system is evaluated on the datasets mentioned in Section 3.1 in terms of accuracy, sensitivity, and specificity by using a confusion matrix and classification report.
Measuring the performance of the classification model against the target labels is an important step. Metrics are used to evaluate the efficiency and effectiveness of the proposed classification model on the test dataset. In this work, the performance of the proposed model is evaluated with accuracy, sensitivity, and specificity, computed from the confusion matrix elements, as shown in
Actual ASD values | Predicted ASD | Predicted No ASD |
---|---|---|
ASD | True positive (TP) | False negative (FN) |
No ASD | False positive (FP) | True negative (TN) |
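The three metrics can be computed from the confusion matrix elements using their standard definitions, as in the sketch below. The counts in the example are made up for illustration and are not results from this work.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, and specificity from confusion matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of actual ASD cases detected
        "specificity": tn / (tn + fp),    # share of actual non-ASD cases rejected
    }

# Hypothetical counts for 100 test cases.
m = classification_metrics(tp=45, fp=2, fn=5, tn=48)
print(m)
```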
The experiments use the 14 attributes selected by the FS method and evaluate the classification accuracy of the three algorithms combined with max voting. The three algorithms are also evaluated separately to assess the performance of the proposed ensemble model. These algorithms are applied to the four datasets: Toddlers, Children, Adolescents, and Adults. To understand the importance of FS algorithm in classification,
Classifiers | Total number of features | Accuracy before FS (%) | Reduced features | Accuracy after FS (%) |
---|---|---|---|---|
SVM | 21 | 91.2 | 15 | 93.42 |
SMO–SVM | 21 | 93 | 15 | 94.6 |
Kohonen | 21 | 89 | 15 | 91 |
RF | 21 | 92 | 15 | 94.37 |
Proposed ASD prediction system | 21 | 94.78 | 15 | 99.54 |
The evaluated result of this
Classifiers | Toddlers SN | Toddlers SP | Toddlers Acc | Children SN | Children SP | Children Acc | Adolescents SN | Adolescents SP | Adolescents Acc | Adults SN | Adults SP | Adults Acc |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SVM | 96 | 95.75 | 96.79 | 92.3 | 93.23 | 94.23 | 90.32 | 92.32 | 95.3 | 92.43 | 89.34 | 94.38 |
SMO–SVM | 88.8 | 92.3 | 96.22 | 89.23 | 93.23 | 95.98 | 91.43 | 94.23 | 96.3 | 92.88 | 95.38 | 97.37 |
Kohonen | 87.23 | 93.42 | 95.32 | 86.34 | 92.23 | 94.26 | 88.34 | 93.54 | 95.3 | 89.9 | 94.08 | 96.89 |
RF | 91.23 | 90.23 | 93.42 | 92.34 | 91.78 | 94.27 | 93 | 92.4 | 96.98 | 94.87 | 94.03 | 96.98 |
Proposed ASD Prediction system | 99.33 | 98.23 | 99.64 | 99.12 | 98.65 | 99.12 | 99.01 | 99.34 | 99.78 | 98.98 | 99.45 | 99.8 |
Across the four datasets, the classifiers obtain sensitivity, specificity, and accuracy values in the range of 86%–99.8%. On the Toddlers dataset, our proposed approach obtains 99.64% accuracy, which is higher than that of the other algorithms. The next best algorithm is SVM, and the lowest accuracy is obtained by the RF approach. The results are illustrated in
For the evaluation of the Children dataset, our proposed classification approach obtains 99.12% accuracy that is higher than other algorithms. The next best method is SMO–SVM, and the least accuracy is obtained by SVM. The results are illustrated in
For the evaluation of the Adolescents dataset, our proposed classification approach obtains 99.78% accuracy, which is higher than that of the other algorithms. The next best method is RF, and the lowest accuracy is obtained by SVM and Kohonen. The results are illustrated in
For the evaluation of the Adults dataset, our proposed classification approach obtains 99.8% accuracy that is higher than other algorithms. The next best method is SMO–SVM, and the least accuracy is obtained by SVM. The results are illustrated in
Overall, our proposed ensemble-based ASD prediction system achieves 99.64% accuracy on the Toddlers dataset, 99.12% on the Children dataset, 99.78% on the Adolescents dataset, and 99.8% on the Adults dataset. The proposed model is accurate on all four datasets and obtains its highest accuracy on the Adults dataset.
During classification, different types of errors can be observed. These errors can be expressed as mean absolute error (MAE), root mean square error (RMSE), and relative absolute error (RAE), and they represent differences between the predicted and observed data. The best classification technique has the least amount of error. In this proposed work, MAE, RMSE, and RAE are calculated using the following equations [
MAE: It is the average, over the test samples, of the absolute difference between the prediction and the actual observation.
RMSE: It is the square root of the average squared difference between the prediction and the actual observation.
RAE: It is the total absolute error of the model relative to the total absolute error of a trivial or base model, using the equation
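The three error measures defined above can be sketched as follows, using their standard definitions. For RAE, the trivial base model is assumed to be one that always predicts the mean of the actual observations, which is the usual convention; the labels in the example are hypothetical.

```python
import math

def error_metrics(actual, predicted):
    """Return (MAE, RMSE, RAE) for paired actual and predicted values."""
    n = len(actual)
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    # Assumed baseline: always predict the mean of the actual values.
    mean_actual = sum(actual) / n
    rae = sum(abs_err) / sum(abs(a - mean_actual) for a in actual)
    return mae, rmse, rae

# Hypothetical labels for eight test cases with one misclassification.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0]
mae, rmse, rae = error_metrics(actual, predicted)
print(mae, rmse, rae)
```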
Classification errors of the proposed model include MAE of 0.04, RMSE of 0.02, and RAE of 0.12, which are the lowest among all the existing classifiers. For the Adolescents dataset, the classification errors of the proposed model include MAE of 0.05, RMSE of 0.02, and RAE of 0.12, which are the lowest among all the existing classifiers. For the Adults dataset, the classification errors of the proposed model include MAE of 0.04, RMSE of 0.01, and RAE of 0.11, which are again the lowest among all the existing classifiers. Hence, our proposed ensemble-based ASD prediction system obtains the highest accuracy and the lowest error rate on all four datasets.
Autism disorder is a substantial problem that is difficult to predict and prevent, and raising a child with this serious disorder is challenging for the family. Diagnosing ASD at an early age is very important. Current research focuses strongly on improving the early prediction of autism disorder and its accuracy. The proposed technique achieves 99% accuracy with an error of 0.02, which is the best result among the existing ML techniques compared. The success of ML makes it possible to combine the best ML algorithms through ensemble learning and to perform a faster prediction. The outputs of the ensemble members are combined with the bagging concept, and max voting predicts more accurately than the other prediction algorithms. In future work, prediction accuracy can be improved further by collecting additional patient data such as brain MRI, face recognition, body posture, and patient response models.