Human gait recognition (HGR) has received considerable attention in the last decade as an alternative biometric technique. The main challenges in gait recognition are changes in a person's view angle and covariate factors. The major covariate factors are walking while carrying a bag and walking while wearing a coat. Deep learning is a machine learning technique that is gaining popularity, and many deep learning based techniques for HGR are presented in the literature. Nevertheless, an efficient framework is still required for correct and quick gait recognition. In this work, we propose a fully automated deep learning and improved ant colony optimization (IACO) framework for HGR using video sequences. The proposed framework consists of four primary steps. In the first step, the database is normalized at the video-frame level. In the second step, two pre-trained models, ResNet101 and InceptionV3, are selected and modified according to the nature of the dataset. After that, both modified models are trained using transfer learning, and features are extracted. The IACO algorithm then selects the best features, which are passed to the Cubic SVM for final classification. The Cubic SVM employs a multiclass method. The experiment was carried out on three angles (0, 18, and 180 degrees) of the CASIA B dataset, and the accuracy was 95.2, 93.9, and 98.2 percent, respectively. A comparison with existing techniques is also performed, and the proposed method outperforms them in terms of accuracy and computational time.
Human identification using biometric techniques has become the most important issue in recent years [
The model-based approach directs human movement based on prior knowledge [
Mehmood et al. [
Based on these studies, we consider the following challenges in this work: i) change in human view angle; ii) change in wearing conditions, such as clothes; iii) change in walking style, such as slow walk or fast walk; iv) deep learning requires a large amount of data to train a good model, but obtaining such data is not always possible due to various factors. To address these issues, we propose a new deep learning and Improved Ant Colony Optimization framework for accurate HGR.
Two pre-trained models, InceptionV3 and ResNet101, are modified in terms of their fully connected layers, and a new layer is added and connected to the preceding layers. A feature selection technique named improved ant colony optimization (IACO) is proposed. In this approach, features are initially selected using ACO and then refined using an activation function based on the mean, standard deviation, and variance. IACO is applied to both modified deep learning models to compare accuracy, and the model with the better accuracy is used for the final classification.
This section describes the proposed human gait recognition method.
CASIA B [
Deep learning demonstrated massive success in the classification phase of machine learning [
Suppose we have some
The ReLU layer is an activation layer that introduces non-linearity between layers. In this layer, negative feature values are set to zero. Mathematically, it is defined as follows:
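As a minimal sketch of the behavior described above (not the authors' implementation), ReLU can be written as an element-wise maximum with zero. The function name `relu` and the sample values are illustrative assumptions.

```python
def relu(values):
    """Element-wise ReLU: negative entries become 0, others pass through."""
    return [max(0.0, v) for v in values]


# Hypothetical feature values, chosen only for illustration.
features = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = relu(features)
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```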
Batch normalization is achieved through a normalization step that fixes the means and variances of each layer's inputs. Ideally, the normalization would be conducted over the entire training set. Mathematically, it is formulated as follows:
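The normalization step can be sketched for a single feature over one mini-batch as follows. This is a sketch under the standard batch-normalization formulation; the scale `gamma`, shift `beta`, and stabilizer `eps` are assumptions that the text above does not specify.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch to zero mean and unit variance.

    gamma, beta, and eps are assumed defaults, not values from the paper.
    """
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


# Hypothetical activations of one neuron over a mini-batch of four samples.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```

After normalization the values have a mean of approximately zero and a variance of approximately one, which is what keeps the input distribution of the next layer stable during training.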
The pooling layer is normally applied after the convolution layer to reduce the spatial size of the input. It is applied individually to each depth slice of an input volume. The volume depth is always conserved in pooling operations. Consider, we have an input volume of the width
The average pool layer calculates the average value for each patch on a feature map. Mathematically, it is formulated as follows:
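To illustrate average pooling on one depth slice, the following sketch averages non-overlapping patches of a small feature map. The window `size`, the `stride`, and the sample map are illustrative assumptions.

```python
def average_pool(feature_map, size=2, stride=2):
    """Average each size x size patch of a 2-D feature map (one depth slice)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            patch = [feature_map[r + i][c + j]
                     for i in range(size) for j in range(size)]
            out_row.append(sum(patch) / len(patch))
        pooled.append(out_row)
    return pooled


# Hypothetical 4 x 4 feature map.
feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(average_pool(feature_map))  # [[3.5, 5.5], [11.5, 13.5]]
```

With a 2 x 2 window and a stride of 2, the 4 x 4 map is reduced to 2 x 2. The operation runs on each depth slice separately, so the depth of the volume is unchanged.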
Neurons in the fully connected (FC) layer have full connections to all the activations in the previous layer. Their activations can be computed with a matrix multiplication followed by a bias offset. Finally, the output of this layer is passed to a Softmax classifier for the final classification. Mathematically, this function is defined as follows:
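The two operations described above, a weighted sum plus bias followed by Softmax, can be sketched as below. The weights, biases, and input features are made-up values for illustration, and the helper names are assumptions rather than the authors' code.

```python
import math


def fully_connected(inputs, weights, biases):
    """One FC layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3-feature input and a 3-class FC layer.
features = [0.5, -1.0, 2.0]
weights = [[0.2, 0.1, 0.4], [-0.3, 0.8, 0.1], [0.5, 0.5, -0.2]]
biases = [0.0, 0.1, -0.1]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```

The class with the largest probability is taken as the prediction. Subtracting the maximum score before exponentiation does not change the result but keeps the computation numerically stable.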
In the literature, several models have been introduced for classification, such as ResNet, VGG, GoogleNet, and InceptionV3, to name a few [
ResNet stands for residual network, and it plays a significant role in computer vision problems. ResNet101 [
Consider
Visually, this process is illustrated in
This network consists of 48 layers and is trained on the 1000 object classes [
Optimal feature selection is an important research area in pattern recognition [
Here, every feature location is given as
Here,
Here, η (0 < η < 1) denotes the pheromone evaporation rate, that is, the ratio of pheromone lost per iteration. A new pheromone value is obtained after every iteration. Mathematically, this process is formulated as follows:
Here,
Here,
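The pheromone evaporation and update step described earlier in this section can be sketched as follows. This sketch assumes the common ACO rule in which each trail first loses a fraction η of its pheromone and then receives a deposit. The deposit values, the default `eta`, and the function name are illustrative assumptions, not the authors' exact formulation.

```python
def update_pheromones(pheromones, deposits, eta=0.2):
    """Evaporate a fraction eta of each pheromone value, then add new deposits.

    Assumed rule: tau_new = (1 - eta) * tau_old + deposit, with 0 < eta < 1.
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return [(1.0 - eta) * tau + delta
            for tau, delta in zip(pheromones, deposits)]


# Hypothetical pheromone levels for three candidate features.
pheromones = [1.0, 1.0, 1.0]
deposits = [0.0, 0.5, 0.1]  # features on better paths receive larger deposits
for _ in range(3):
    pheromones = update_pheromones(pheromones, deposits)
print(pheromones)
```

Features that keep receiving deposits retain high pheromone values, while features that are rarely chosen decay toward zero, which is what makes them less likely to be selected in later iterations.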
This section discusses the experimental process, including the experimental setup, dataset, evaluation measures, and results. The CASIA B dataset is utilized in this work and divided in a 70:30 ratio; that is, 70% of the dataset is used for training and the remaining 30% for testing. During training, the number of epochs is set to 100, the number of iterations to 300, the mini-batch size to 64, and the learning rate to 0.0001. The Stochastic Gradient Descent (SGD) optimizer is employed for learning, and ten-fold cross-validation is conducted. Multiple classifiers are used, and each classifier is validated by six measures: recall rate, precision rate, FNR, AUC, accuracy, and time. All simulations in this work are conducted in MATLAB 2020a on a Core i7 system with 16 GB of RAM and an 8 GB graphics card.
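Several of the measures listed above can be derived from a confusion matrix. The sketch below assumes macro-averaging over classes and takes FNR as 100 minus the recall rate; both are assumptions, since the averaging scheme is not stated here. The confusion matrix is made up for illustration and is not data from the experiments.

```python
def evaluate(confusion):
    """Compute macro recall, macro precision, FNR, and accuracy in percent.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    recalls, precisions = [], []
    for k in range(n):
        true_positive = confusion[k][k]
        actual = sum(confusion[k])                       # row sum
        predicted = sum(confusion[r][k] for r in range(n))  # column sum
        recalls.append(true_positive / actual if actual else 0.0)
        precisions.append(true_positive / predicted if predicted else 0.0)
    recall = 100.0 * sum(recalls) / n
    precision = 100.0 * sum(precisions) / n
    return {
        "recall": recall,
        "precision": precision,
        "fnr": 100.0 - recall,
        "accuracy": 100.0 * correct / total,
    }


# Hypothetical two-class confusion matrix.
metrics = evaluate([[8, 2], [1, 9]])
print(metrics)
```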
Three angles are considered for the experimental process: 0, 18, and 180 degrees. The results are computed for both modified deep models, ResNet101 and InceptionV3. For all three angles, the results of the ResNet101 model are presented in
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 89.6 | 89.9 | 10.3 | 0.97 | 89.6 | 214 |
Quadratic SVM | 94.7 | 94.8 | 5.2 | 0.99 | 94.8 | 231.9 |
Medium GSVM | 92 | 92.2 | 8 | 0.98 | 92 | 277 |
Fine KNN | 92.2 | 92.4 | 7.7 | 0.94 | 92.3 | 303.5 |
Subspace KNN | 91.9 | 92 | 8 | 0.96 | 85.8 | 352.1 |
Weighted KNN | 85.8 | 87.4 | 14.2 | 0.96 | 85.8 | 352.1 |
Cosine KNN | 86.7 | 87.4 | 13.2 | 0.97 | 86.8 | 315.7 |
Cubic KNN | 84 | 85.4 | 16 | 0.95 | 83.9 | 1101 |
Medium KNN | 84.4 | 86 | 15.6 | 0.96 | 84.4 | 311.3 |
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 83.5 | 83.6 | 16.5 | 0.95 | 83.5 | 167.1 |
Quadratic SVM | 89.1 | 89.2 | 10.9 | 0.97 | 89.1 | 250.1 |
Medium GSVM | 86.3 | 86.4 | 13.6 | 0.96 | 86.4 | 494.3 |
Fine KNN | 87.1 | 87.3 | 12.9 | 0.90 | 87.1 | 583.8 |
Subspace KNN | 88.03 | 88.1 | 11.9 | 0.95 | 88 | 1930 |
Weighted KNN | 79.8 | 81.2 | 20.2 | 0.94 | 79.8 | 672 |
Cosine KNN | 78.3 | 78.8 | 21.6 | 0.93 | 78.3 | 644.5 |
Cubic KNN | 76.4 | 77.8 | 23.5 | 0.91 | 76.5 | 1743.7 |
Medium KNN | 76.3 | 77.7 | 23.6 | 0.92 | 76.4 | 591.9 |
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 95.4 | 95.5 | 4.6 | 0.99 | 95.5 | 148.6 |
Quadratic SVM | 97.9 | 97.8 | 2.1 | 1 | 97.8 | 171.4 |
Medium GSVM | 97 | 97 | 2.9 | 1 | 97 | 222 |
Fine KNN | 96.5 | 96.5 | 3.5 | 0.97 | 96.5 | 247 |
Subspace KNN | 96.5 | 96.6 | 3.4 | 0.99 | 96.6 | 902.5 |
Weighted KNN | 91.3 | 92 | 8.7 | 0.98 | 91.3 | 308 |
Cosine KNN | 91.6 | 92.1 | 8.3 | 0.98 | 91.7 | 266.5 |
Cubic KNN | 89.1 | 89.9 | 10.8 | 0.98 | 89.2 | 1236.4 |
Medium KNN | 89.5 | 90.3 | 10.4 | 0.98 | 89.5 | 259.3 |
Moreover, the time difference is not much higher; therefore, we consider cubic SVM better.
In the second phase, we implemented the proposed method for the modified InceptionV3 model. The results are given in
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 83.9 | 84.2 | 16 | 0.96 | 84 | 136.5 |
Quadratic SVM | 91 | 90.9 | 9 | 0.97 | 91 | 162.5 |
Medium GSVM | 89 | 89.2 | 11 | 0.97 | 89.1 | 283 |
Fine KNN | 88.4 | 88.3 | 11.6 | 0.91 | 88.3 | 244.1 |
Subspace KNN | 88.1 | 88.1 | 11.9 | 0.94 | 88.1 | 767 |
Weighted KNN | 84.5 | 85.5 | 15.5 | 0.95 | 84.6 | 322.8 |
Cosine KNN | 84.8 | 85.3 | 15.2 | 0.96 | 84.7 | 302.7 |
Cubic KNN | 83 | 84.1 | 17 | 0.95 | 83.1 | 1117 |
Medium KNN | 83.3 | 88.3 | 11.7 | 0.91 | 88.3 | 262.3 |
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 82.5 | 82.6 | 17.5 | 0.94 | 82.5 | 820 |
Quadratic SVM | 92 | 92 | 8 | 0.98 | 91.9 | 1081.6 |
Medium GSVM | 90.5 | 90.7 | 9.4 | 0.98 | 90.5 | 366 |
Fine KNN | 93.1 | 93 | 7 | 0.95 | 93 | 502 |
Subspace KNN | 91.9 | 91.9 | 8.1 | 0.94 | 91.9 | 778.5 |
Weighted KNN | 88.4 | 89.3 | 11.6 | 0.98 | 88.4 | 778 |
Cosine KNN | 87.7 | 88.2 | 12.3 | 0.97 | 87.7 | 611.5 |
Cubic KNN | 85.5 | 86.3 | 14.5 | 0.96 | 85.5 | 512 |
Medium KNN | 87.4 | 88.1 | 12.6 | 0.97 | 87.4 | 598.8 |
Methods | Recall rate (%) | Precision rate (%) | FNR | AUC | Accuracy (%) | Time rate |
---|---|---|---|---|---|---|
Linear SVM | 90.1 | 90.4 | 9.8 | 0.98 | 90.1 | 295.5 |
Quadratic SVM | 96.03 | 96.06 | 3.9 | 0.99 | 96.1 | 342.8 |
Medium GSVM | 94.8 | 94.9 | 5.1 | 0.99 | 94.8 | 376 |
Fine KNN | 95.9 | 96 | 4.03 | 0.97 | 96 | 397 |
Subspace KNN | 90.4 | 91 | 9.5 | 0.98 | 90.5 | 427 |
Weighted KNN | 90.9 | 91.7 | 7.5 | 0.98 | 91.2 | 433.3 |
Cosine KNN | 89.2 | 90 | 8.02 | 0.94 | 89.9 | 1460 |
Cubic KNN | 91.02 | 91 | 7.6 | 0.96 | 91.4 | 457.7 |
Medium KNN | 95.4 | 95 | 4.53 | 0.99 | 95.5 | 1062.4 |
Reference | Year | Accuracy (%) |
---|---|---|
[ | 2020 | 92.0 |
[ | 2020 | 94.3, 93.8, 94.7 |
Proposed | | 95.2, 93.9, 98.2 |
This section briefly discusses the results to analyze the proposed framework. The results show that the proposed framework performed well on the chosen dataset. The accuracy for 0 and 180 degrees is better with modified ResNet101 and IACO, while the accuracy for 18 degrees is better with modified InceptionV3 and IACO. Compared to InceptionV3, the computational cost of modified ResNet101 with IACO is lower. Furthermore, the original computational cost of modified ResNet101 and InceptionV3 is nearly three times that of the proposed framework (that is, after applying IACO).