Computers, Materials & Continua
Hyperparameter Tuned Deep Learning Enabled Cyberbullying Classification in Social Media
1Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam bin Abdulaziz University, Saudi Arabia
2Department of Electrical Engineering, College of Engineering, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
3Department of Information Systems, College of Computing and Information System, Umm Al-Qura University, Saudi Arabia
4Department of Computer Science, College of Science & Art at Mahayil, King Khalid University, Muhayel Aseer, Saudi Arabia
5Faculty of Computers and Information, Computer Science Department, Menoufia University, Egypt
6Research Centre, Future University in Egypt, New Cairo, 11845, Egypt
7Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
8Department of Information System, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al Kharj, Saudi Arabia
*Corresponding Author: Mesfer Al Duhayyim. Email: firstname.lastname@example.org
Received: 10 April 2022; Accepted: 20 May 2022
Abstract: Cyberbullying (CB) is a challenging issue in social media, and it is important to effectively identify its occurrence. Recently developed deep learning (DL) models pave the way to design CB classifiers with maximum performance. At the same time, an optimal hyperparameter tuning process plays a vital role in enhancing overall results. This study introduces a Teacher Learning Genetic Optimization with Deep Learning Enabled Cyberbullying Classification (TLGODL-CBC) model for social media. The proposed TLGODL-CBC model intends to identify the existence and non-existence of CB in the social media context. Initially, the input data is cleaned and pre-processed to make it compatible for further processing. Next, an independent recurrent autoencoder (IRAE) model is utilized for the recognition and classification of CB. Finally, the TLGO algorithm is used to optimally adjust the parameters of the IRAE model, which shows the novelty of the work. To assure the improved outcomes of the TLGODL-CBC approach, a wide range of simulations is executed and the outcomes are investigated under several aspects. The simulation outcomes confirm the improvements of the TLGODL-CBC model over recent approaches.
Keywords: Social media; deep learning; cyberbullying; cybersecurity; hyperparameter optimization
The rising number of users on social media has resulted in a new form of bullying. The latter is defined as an aggressive or intentional act performed by an individual or a group of persons by repeatedly transmitting messages over time toward a victim who is not in a position to defend himself or herself. Bullying has always been a part of the community. With the emergence of the internet, it was only a matter of time until bullies found their way onto this opportunistic new medium. According to the National Crime Prevention Council, cyberbullying (CB) is defined as the use of mobile phones, the Internet, or other devices for sharing or posting messages or pictures that intentionally hurt or embarrass another individual. Several researchers state that between 10% and 40% of internet users are considered victims of CB [2,3]. The effects of CB are serious and may range from temporary anxiety to suicide.
Identification of CB in mass communication is regarded as a difficult job, since the definition of what constitutes CB is completely subjective. For instance, according to common people, the recurrent use of swear words in texts or messages may be regarded as bullying, while on teen-oriented mass media platforms such as Formspring it is not considered bullying. CB attacks victims on distinct subjects, namely religion, gender, and race. Depending on the subject of CB, the vocabulary and perceived meaning of words change remarkably. CB awareness has been raised in most countries because of the consequences described in this work. Correspondingly, many authors have applied machine learning (ML) methods for identifying CB in an automatic way [6,7]. However, most studies in this domain have been carried out for the English language. Moreover, the studies that have been carried out usually utilize text mining methods equivalent to those of sentiment analysis. In reality, mass media posts are context-dependent and interactive; therefore, a post cannot be regarded as a standalone message. A lot of research work has been conducted on explicit speech identification. Still, a greater volume of study is necessary for handling implicit language, which makes identifying cyberbullying in mass communication a difficult task. Nonetheless, advances have been made in the identification of cyberbullying by DL and ML methods, but many recent studies must still be advanced to offer an appropriate method that covers both explicit and implicit elements.
One study presented XP-CB, a novel cross-platform framework based on Transformers and adversarial learning. XP-CB enhances a Transformer by leveraging unlabelled data from the source and target platforms to arrive at a common representation while preventing platform-specific training. Another work compared the performance of several word embedding approaches, from fundamental word embeddings to recent advanced language models, for CB recognition, utilizing LightGBM and logistic regression (LR) techniques for the classification of bullying and non-bullying tweets. Murshed et al. examined a hybrid DL technique named deep autoencoder with recurrent neural network (DEA-RNN) for detecting CB on the Twitter social media network. The presented DEA-RNN technique integrates an Elman-type RNN with an optimizing Dolphin Echolocation Algorithm (DEA) to fine-tune the Elman RNN parameters and reduce training time.
Dewani et al. performed extensive pre-processing on Roman Urdu micro-text. This typically includes the development of a Roman Urdu slang-phrase dictionary, slang mapping, and then tokenization. The unstructured data is further managed to handle encoded text formats and metadata or non-linguistic features. They also carried out extensive experiments applying convolutional neural network (CNN), RNN-long short term memory (LSTM), and RNN-bidirectional LSTM (BiLSTM) techniques. Ahmed et al. presented binary and multi-class classification methods utilizing hybrid neural networks (NNs) for bully expression recognition in the Bengali language, using 44,001 user comments on popular public Facebook pages that fall into five classes: Religious, Sexual, Non-bully, Threat, and Troll.
This study introduces a Teacher Learning Genetic Optimization with Deep Learning Enabled Cyberbullying Classification (TLGODL-CBC) model for social media. The proposed TLGODL-CBC model intends to identify the existence and non-existence of CB in the social media context. Initially, the input data is cleaned and pre-processed to make it compatible for further processing. Next, an independent recurrent autoencoder (IRAE) model is utilized for the recognition and classification of CB. Finally, the TLGO algorithm is used to optimally adjust the parameters of the IRAE model. To assure the improved outcomes of the TLGODL-CBC method, a wide range of simulations is executed and the outcomes are investigated from several aspects.
In this study, a new TLGODL-CBC model was established for identifying the existence and non-existence of CB in the social media context. At the primary stage, the input data is cleaned and pre-processed to make it compatible for further processing. Then, the pre-processed data is fed into the IRAE method for the recognition and classification of CB. Finally, the TLGO algorithm is used to optimally adjust the parameters of the IRAE model, resulting in enhanced classification performance. Fig. 1 depicts the overall block diagram of the TLGODL-CBC technique.
Primarily, the data is refined to the maximum degree by executing the following steps: punctuation removal, stop word removal, sentence segmentation, tokenization, and lower-casing. These stages reduce the data in size and also eliminate unwanted content from it. During this process, a generic pre-processed corpus is generated by removing punctuation and any non-letter characters from every document. Finally, the letter case of all documents is lowered. The outcome of this process is a sliced document text of length n, produced with an n-gram word-based tokenizer. The stage after tokenization transforms each token into a normalized format. Similar to stemming, the purpose of lemmatization is to reduce inflectional forms to a common base form. Prepositions, conjunctions, articles, and some pronouns are treated as stop words.
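A pipeline of this kind can be sketched as follows; this is a minimal illustration, not the authors' implementation, and the stop-word list and suffix-stripping lemmatizer are deliberately tiny stand-ins for the full resources a production system would use:

```python
# Toy pre-processing pipeline: punctuation removal, lower-casing,
# tokenization, stop-word removal, and a naive suffix-stripping lemmatizer.
import re

# Small illustrative stop-word list (a real system would use a full one).
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "in", "on", "at",
              "to", "of", "is", "are", "was", "were", "he", "she", "it"}

def preprocess(text: str) -> list[str]:
    """Return a list of cleaned, lemmatized tokens for one document."""
    text = text.lower()                       # lower-casing
    text = re.sub(r"[^\w\s]", " ", text)      # punctuation / non-letter removal
    tokens = [t for t in text.split()         # whitespace tokenization
              if t not in STOP_WORDS]         # stop-word removal
    lemmas = []
    for t in tokens:
        # naive lemmatization: strip a common inflectional suffix, if any
        for suffix in ("ing", "ed", "s"):
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        lemmas.append(t)
    return lemmas

print(preprocess("He was repeatedly posting hurtful messages!"))
```

In a real system the token stream would then be sliced into n-grams before being fed to the classifier.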
At this stage, the pre-processed data is fed into the IRAE technique for the recognition and classification of CB. Standard RNNs are prone to gradient vanishing and explosion during training because of the repeated multiplication of the weight matrix. The independently recurrent neural network (IRNN) is a novel recurrent network which efficiently resolves the gradient problem by altering the time-based gradient back-propagation [16–20]. Also, the IRNN employs an unsaturated function, namely the rectified linear unit (ReLU), as activation function and thereby retains higher robustness during training. The IRNN is formulated as:

$$h_t = \sigma\left(W x_t + u \odot h_{t-1} + b\right)$$
in which the input weight $W$ is a matrix, the recurrent weight $u$ is a vector, and $\odot$ signifies the Hadamard product. Each neuron at time $t$ is only connected to its own state at time $t-1$, as elementwise multiplication is implemented rather than vector multiplication. A single neuron is formulated as:

$$h_{n,t} = \sigma\left(w_n x_t + u_n h_{n,t-1} + b_n\right)$$
where $w_n$ denotes the $n$-th row of $W$ and $u_n$ implies the $n$-th element of $u$. The gradient of each neuron is computed independently in back-propagation, as no communication occurs among the neurons of the same layer. Considering the objective $J_n$ to be minimized at time step $T$, once the gradient is propagated back to step $t$, the computation outcome is expressed as:

$$\frac{\partial J_n}{\partial h_{n,t}} = \frac{\partial J_n}{\partial h_{n,T}} \prod_{k=t}^{T-1} \sigma'_{n,k+1}\, u_n = \frac{\partial J_n}{\partial h_{n,T}}\, u_n^{T-t} \prod_{k=t}^{T-1} \sigma'_{n,k+1}$$
where $\sigma'$ implies the derivative of the activation function; the derivative of the ReLU activation function is either 0 or 1. If the gradient at time step $T$ propagates to time step $t$, the minimal and maximal effective gradients are $|u_n|^{T-t}(\sigma'_{\min})^{T-t}$ and $|u_n|^{T-t}(\sigma'_{\max})^{T-t}$, correspondingly. If every neuron fulfils the criterion that this effective gradient stays bounded away from zero and infinity, the gradient can neither vanish nor explode at time $t$. For generally employed activation functions such as ReLU and tanh, the derivative is not greater than 1, i.e., $|\sigma'| \le 1$; in particular, for the ReLU function used in this case, the derivative is either 0 or 1. The IRAE integrates the features of the IRNN and the autoencoder (AE). Accordingly, the IRAE is a recurrently connected infrastructure, which allows it to train and transfer parameters along the time direction. A single IRAE is also an unsupervised learning NN, which attempts to reproduce the input by comparing the final output to the input. The IRAE first encodes the input $x_t$ according to Eq. (4) to obtain the hidden representation $h_t$; next, it executes the decoding function according to Eq. (5) to obtain the reconstruction $\hat{x}_t$:

$$h_t = \sigma\left(W x_t + u \odot h_{t-1} + b\right) \qquad (4)$$

$$\hat{x}_t = \sigma\left(W' h_t + c\right) \qquad (5)$$
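The vanishing/exploding behaviour described above can be made concrete with a toy calculation: assuming the ReLU derivative is 1 along the active path, the recurrent gradient factor accumulated over $T-t$ steps scales as $|u_n|^{T-t}$. A minimal numerical illustration (not from the paper):

```python
# Per-neuron gradient scale after `steps` time steps, assuming the
# activation derivative is 1 on the active path (ReLU best case):
# the factor is simply |u|**steps, so it vanishes for |u| < 1 and
# explodes for |u| > 1, while |u| == 1 keeps it constant.
def gradient_scale(u: float, steps: int) -> float:
    """Magnitude of the recurrent gradient factor after `steps` steps."""
    return abs(u) ** steps

for u in (0.5, 1.0, 1.5):
    print(u, [gradient_scale(u, k) for k in (1, 10, 50)])
```

This is why the IRNN constrains the per-neuron recurrent weight: keeping $|u_n|$ near 1 keeps the effective gradient in a workable range.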
in which $W$ refers to the encoder weight matrix, $u$ signifies the recurrent weight vector, $b$ denotes the encoder bias vector, $W'$ represents the decoder weight matrix, and $c$ implies the decoder bias vector. To make the network sparse and enhance training speed, the ReLU function is chosen as the activation function in this case. Fig. 2 showcases the framework of the RAE.
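The encode/decode recurrence can be sketched in NumPy as follows; shapes, dimensions, and random weights here are illustrative assumptions for demonstrating the data flow, not the authors' trained model:

```python
# Toy IRAE forward pass: IndRNN-style encoder h_t = ReLU(W x_t + u*h_{t-1} + b)
# followed by a decoder x_hat_t = ReLU(W_dec h_t + c).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, T = 8, 16, 5                # input dim, hidden dim, seq length

W = rng.normal(scale=0.1, size=(d_hid, d_in))      # encoder weight matrix
u = rng.uniform(0.0, 1.0, size=d_hid)              # per-neuron recurrent weights
b = np.zeros(d_hid)                                # encoder bias
W_dec = rng.normal(scale=0.1, size=(d_in, d_hid))  # decoder weight matrix
c = np.zeros(d_in)                                 # decoder bias

def relu(z):
    return np.maximum(z, 0.0)

def irae_forward(x_seq):
    """Encode a (T, d_in) sequence step by step and return its reconstruction."""
    h = np.zeros(d_hid)
    recon = []
    for x_t in x_seq:
        # elementwise (Hadamard) recurrence: each neuron sees only itself
        h = relu(W @ x_t + u * h + b)
        recon.append(relu(W_dec @ h + c))
    return np.stack(recon)

x = rng.normal(size=(T, d_in))
x_hat = irae_forward(x)
print("reconstruction error:", float(np.mean((x - x_hat) ** 2)))
```

In practice the weights would be trained by back-propagating the reconstruction error through time; the random weights above only demonstrate the shapes and the per-neuron recurrence.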
2.3 TLGO Based Parameter Optimization
Lastly, the TLGO algorithm is used to optimally adjust the parameters of the IRAE model, resulting in enhanced classification performance. We projected the TLGO approach by integrating features of the genetic algorithm (GA) with teaching-learning-based optimization (TLBO). Here, exploration and exploitation are the two major features that should be considered: to accomplish a superior outcome, there needs to be a balance between global and local search. TLBO performs well in the exploitation phase, that is, finding the optimal solution in the local search space, but performs poorly in the exploration phase. To resolve the imbalance between the exploitation and exploration stages, we developed the TLGO method by adding the mutation and crossover operators of GA to the TLBO algorithm. GA performs well in the exploration phase and has a better convergence rate. At first, the TLGO approach follows the same steps as TLBO: the first two stages of TLBO, i.e., the teacher and learner phases, are added to the TLGO method without any changes. In the learner stage, each learner interacts with the others to improve their knowledge; for every learner, a peer is selected arbitrarily and the population is updated accordingly. The designated parents then reproduce a novel offspring by carrying out the crossover operator. The mutation step is used to arbitrarily invert a bit of the chosen candidate, and the population is upgraded once the mutation operation is implemented. Next, the fitness of the population is evaluated. The procedure repeats until the end condition is satisfied.
The TLGO method uses both the exploration and exploitation phases and maintains a balance between local and global search by applying the mutation and crossover operators of GA to the TLBO technique. The presented method has a good convergence rate with less computational effort in terms of finding an optimum solution. Algorithm 1 illustrates the working process of the presented TLGO approach.
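The steps above can be sketched as a toy implementation: standard TLBO teacher/learner phases followed by GA-style crossover and mutation, minimising a simple sphere objective. All constants (population size, iteration count, mutation rate, search bounds) are illustrative choices, not the paper's settings:

```python
# Hedged sketch of a TLBO + GA hybrid on a continuous toy problem.
import random

random.seed(42)                                  # reproducible toy run
DIM, POP, ITERS = 5, 20, 200
f = lambda x: sum(v * v for v in x)              # sphere objective (minimise)

def clip(x):
    """Keep a candidate inside the [-5, 5] search box."""
    return [max(-5.0, min(5.0, v)) for v in x]

pop = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(POP)]
for _ in range(ITERS):
    # Teacher phase: shift learners toward the best (teacher) solution.
    teacher = min(pop, key=f)
    mean = [sum(x[d] for x in pop) / POP for d in range(DIM)]
    for i, x in enumerate(pop):
        tf = random.choice((1, 2))               # teaching factor
        cand = clip([x[d] + random.random() * (teacher[d] - tf * mean[d])
                     for d in range(DIM)])
        if f(cand) < f(x):                       # greedy acceptance
            pop[i] = cand
    # Learner phase: each learner interacts with a random peer.
    for i, x in enumerate(pop):
        j = random.randrange(POP)
        if j == i:
            continue
        sign = 1 if f(x) < f(pop[j]) else -1     # move toward the better one
        cand = clip([x[d] + sign * random.random() * (x[d] - pop[j][d])
                     for d in range(DIM)])
        if f(cand) < f(x):
            pop[i] = cand
    # GA operators: uniform crossover of two random parents, then mutation.
    p1, p2 = random.sample(pop, 2)
    child = [random.choice((a, b)) for a, b in zip(p1, p2)]
    if random.random() < 0.2:                    # mutate one gene sometimes
        child[random.randrange(DIM)] += random.gauss(0, 0.5)
    child = clip(child)
    worst = max(range(POP), key=lambda k: f(pop[k]))
    if f(child) < f(pop[worst]):                 # elitist replacement
        pop[worst] = child

best = min(pop, key=f)
print("best sphere fitness:", f(best))
```

When used for hyperparameter tuning as in this paper, each candidate vector would encode IRAE hyperparameters and the fitness would be the validation classification error rather than the sphere function.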
The experimental validation of the TLGODL-CBC model is carried out using a benchmark dataset from the Kaggle repository. The dataset holds samples under two class labels (Insult-1049/Normal-2898). Tab. 1 illustrates the dataset details.
The confusion matrices generated by the TLGODL-CBC model on distinct sizes of training (TR) and testing (TS) data are illustrated in Fig. 3. With 80% of TR data, the TLGODL-CBC model has recognized 743 samples as insult and 2212 samples as normal. Moreover, with 20% of TS data, the TLGODL-CBC approach has recognized 187 samples as insult and 548 samples as normal. Furthermore, with 70% of TR data, the TLGODL-CBC system has recognized 667 samples as insult and 1903 samples as normal. At last, with 30% of TS data, the TLGODL-CBC algorithm has recognized 282 samples as insult and 832 samples as normal.
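The per-split scores reported below are derived from such 2x2 confusion matrices. A small helper for computing the usual classification metrics might look like the following; the true-positive and true-negative counts are taken from the 20% TS split above, while the false-positive and false-negative counts are hypothetical, since the paper does not list them:

```python
# Derive standard classification metrics from a 2x2 confusion matrix.
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F-score for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }

# tp=187 (insult) and tn=548 (normal) come from the 20% TS split;
# fp and fn below are illustrative placeholders.
print(metrics(tp=187, fp=12, fn=15, tn=548))
```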
Tab. 2 reports a brief classifier result of the TLGODL-CBC model with 80% of TR data and 20% of TS data. The experimental results indicate that the TLGODL-CBC model has reached effectual outcomes in both cases. For instance, with 80% of TR data, the TLGODL-CBC model offered average accuracy, precision, recall, and F-score of 93.60%, 91.59%, 92.09%, and 91.83% respectively. Also, with 20% of TS data, the TLGODL-CBC technique obtained average accuracy, precision, recall, and F-score of 93.04%, 91.02%, 91.38%, and 91.20% correspondingly.
Tab. 3 provides a detailed classifier result of the TLGODL-CBC model with 70% of TR data and 30% of TS data. The experimental outcomes show that the TLGODL-CBC approach has reached effectual outcomes in both cases. For instance, with 70% of TR data, the TLGODL-CBC approach attained average accuracy, precision, recall, and F-score of 93.05%, 90.44%, 92.32%, and 91.31% correspondingly. Eventually, with 30% of TS data, the TLGODL-CBC model offered average accuracy, precision, recall, and F-score of 94.01%, 92.07%, 91.67%, and 92.36% correspondingly.
Fig. 4 provides an average result analysis of the TLGODL-CBC model with varying TR/TS data. The figure reported that the TLGODL-CBC model has accomplished effectual outcomes under distinct sizes of TR/TS data.
Fig. 5 offers the accuracy and loss graph analysis of the TLGODL-CBC method on distinct TR/TS datasets. The results show that the accuracy value tends to increase and the loss value tends to decrease with a higher epoch count. It can also be observed that the training loss is lower and the validation accuracy is higher on the test dataset.
To assure the enhanced performance of the TLGODL-CBC model, a comparison study is made with recent models in Tab. 4. Fig. 6 provides a comprehensive comparative examination of the TLGODL-CBC model with recent models. The figure implies that the RNN and support vector machine (SVM) models have resulted in lower values of 80.12% and 82.43% respectively. Along with that, the BLSTM and LSTM models have accomplished slightly enhanced values of 85.30% and 84.27% correspondingly. Furthermore, the gated recurrent unit (GRU) and random forest (RF) models have accomplished reasonable values of 88.97% and 87.51% respectively. However, the TLGODL-CBC model has reached a maximum value of 92.07%.
Fig. 7 offers a detailed comparative analysis of the TLGODL-CBC approach with recent models. The figure implies that the RNN and SVM models have resulted in lower values of 83.69% and 82.38% correspondingly. Likewise, the BLSTM and LSTM models have accomplished slightly enhanced values of 89.93% and 83.69% respectively. Next, the GRU and RF techniques have accomplished reasonable values of 81.83% and 89.93% correspondingly. Finally, the TLGODL-CBC technique has reached a maximum value of 92.67%.
Fig. 8 illustrates a brief comparative investigation of the TLGODL-CBC system with recent models. The figure shows that the RNN and SVM approaches have resulted in lower values of 81.99% and 83.20% respectively. Likewise, the BLSTM and LSTM models have accomplished somewhat improved values of 89.50% and 88.51% correspondingly. In addition, the GRU and RF models have accomplished reasonable values of 82.77% and 83.84% correspondingly. Finally, the TLGODL-CBC model has reached a maximum value of 94.01%.
Fig. 9 demonstrates a comprehensive comparative analysis of the TLGODL-CBC technique with recent algorithms. The figure implies that the RNN and SVM models have resulted in lower values of 89.07% and 88.63% respectively. Also, the BLSTM and LSTM models have accomplished slightly enhanced values of 85.01% and 81.11% respectively. Next, the GRU and RF models have accomplished reasonable values of 81.73% and 82.01% correspondingly. At last, the TLGODL-CBC model has reached a maximum value of 92.36%. From the aforementioned tables and figures, it is apparent that the TLGODL-CBC model has exhibited maximum performance over the other methods.
In this study, a novel TLGODL-CBC model was established for identifying the existence and non-existence of CB in the social media context. At the primary stage, the input data is cleaned and pre-processed to make it compatible for further processing. Then, the pre-processed data is fed into the IRAE technique for the recognition and classification of CB. Finally, the TLGO algorithm is used to optimally adjust the parameters of the IRAE model, resulting in enhanced classification performance. To assure the enhanced outcomes of the TLGODL-CBC technique, a wide range of simulations was carried out and the outcomes were investigated from several aspects. The simulation outcomes ensured the improvements of the TLGODL-CBC model over recent approaches. Thus, the TLGODL-CBC model has exhibited reasonable performance over other methods. In the future, hybrid DL classification models can be included to boost the overall classification performance.
Funding Statement: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under Grant Number (RGP 2/46/43). Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R140), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4210118DSR12).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.