Siti Zulaikha Mohd Jamaludin1, Mohd. Asyraf Mansor2, Aslina Baharum3,*, Mohd Shareduwan Mohd Kasihmuddin1, Habibah A. Wahab4, Muhammad Fadhil Marsani1
1 School of Mathematical Sciences, Universiti Sains Malaysia, Minden, Penang, 11800, Malaysia
2 School of Distance Education, Universiti Sains Malaysia, Minden, Penang, 11800, Malaysia
3 Faculty of Computing and Informatics, Universiti Malaysia Sabah, Jalan UMS, Kota Kinabalu, 88450, Sabah, Malaysia
4 School of Pharmaceutical Sciences, Universiti Sains Malaysia, Minden, Penang, 11800, Malaysia
* Corresponding Author: Aslina Baharum. Email:
Computers, Materials & Continua 2023, 74(2), 2853-2870. https://doi.org/10.32604/cmc.2023.032654
Received 25 May 2022; Accepted 12 July 2022; Issue published 31 October 2022
Artificial Neural Network (ANN) is a subset of Artificial Intelligence inspired by biological neurons. The primary aim of an ANN is to create a black-box model that can offer an alternative explanation of the data. Using this explanation, the output produced by the ANN can be used to solve various optimization problems. The main problem with conventional ANNs is the lack of symbolic reasoning to govern the modelling of the neurons. Reference  proposed a logical rule in ANN by assigning each neuron to a variable of the logic. This led to the introduction of the Wan Abdullah method, which finds the optimal synaptic weights by comparing the cost function with the final energy function. Reference  proposed another variant of logic, namely 2 Satisfiability (2SAT), in a single-layered ANN, namely the Discrete Hopfield Neural Network (DHNN). The proposed 2SAT was reported to obtain a high global minima ratio when the learning phase of the DHNN is optimized. The discovery of this hybrid network inspired other studies to implement 2SAT in ANNs. Recently,  integrated 2SAT into the Radial Basis Function Neural Network (RBFNN) by calculating the various parameters that lead to the optimal output weight. That work confirms the capability of 2SAT in representing the modelling of the ANN. In another development,  proposed a mutation DHNN by implementing an estimation of distribution algorithm (EDA) during the retrieval phase of the DHNN. This shows that the interpretation of the 2SAT logical rule in the DHNN can be further optimized using optimization algorithms. The implementation of 2SAT in various networks inspired the emergence of other useful logics such as [5–9] in the DHNN. Various types of logical rules create optimal DHNN models with a wide range of behaviors. Despite the variety of logical rules in this field, the exploration of different connectives among clauses is limited.
The most popular application of the logical rule in DHNN is logic mining. Reference  proposed the first logic mining approach, namely the Reverse Analysis (RA) method, by implementing Horn Satisfiability in DHNN. The proposed logic mining managed to extract the logical relationship from student datasets. One of the main issues of this logic mining approach is the lack of focus of the obtained induced logic. In this context, a more robust logic mining approach is required to extract the single most optimal induced logic. Reference  proposed the 2 Satisfiability Reverse Analysis Method (2SATRA) by introducing specific learning and retrieval phases that create the most optimal induced logic. The proposed 2SATRA extracts the best induced logic for League of Legends data. The proposed logic mining was extended to various applications such as palm oil pricing [12,13] and football . After the introduction of 2SATRA in the field of logic mining,  proposed an energy-based logic mining, namely E2SATRA, which considers only the global neuron states during the retrieval phase of DHNN. In this context, the proposed E2SATRA capitalizes on the dynamics of the Lyapunov energy function to arrive at the optimal final neuron state. Note that the final global neuron state ensures the induced logic produced by E2SATRA is interpretable. One of the main issues with the conventional 2SATRA is a possible overfitting issue due to the ineffective connection of attributes during the pre-processing phase. In other words, an attribute might possess the optimal connection with another variable in a 2SAT clause, but the other possible connections are disregarded. The resulting logical rule will be less flexible and fail to emphasize the appropriate non-contributing attributes of a particular data set.
In this paper, the modified 2SATRA integrated with a permutation operator will enhance the capability of selecting the most optimal induced logic by considering other combinations of variables in the 2SAT logic. The proposed modified 2SATRA will extract the optimal logical rule for various real-life datasets. Thus, the correct synaptic weights during the learning phase will determine the capability of the logic mining model and the accuracy of the induced logic generated during the testing phase. This work focuses on the impact of the logical permutation mechanism in the Hopfield Neural Network (HNN) on the performance of 2SATRA in the tasks of data mining and extraction. The contributions of this paper are as follows:
(a) To formulate 2 Satisfiability that incorporates permutation operators, which consider various combinations of variables in a clause.
(b) To implement permutation 2 Satisfiability in the Discrete Hopfield Neural Network by minimizing the cost function during the learning phase, leading to the optimal final neuron state.
(c) To embed the proposed hybrid Discrete Hopfield Neural Network into logic mining, where a more diversified induced logic is proposed.
(d) To evaluate the performance of the proposed permutation logic mining on real-life datasets against other state-of-the-art logic mining methods.
The organization of this paper is as follows. Section 2 provides a brief introduction to the 2 Satisfiability logical representation, including the conventional formulations and examples. Section 3 focuses on the formulation of logical permutation in the 2 Satisfiability Based Reverse Analysis method. Section 4 explains the experimental setup, including the benchmark datasets, performance metrics, baseline methods and experimental design. Then, the results and discussion are covered in Section 5. Finally, the concluding remarks are included in the final section of this paper.
Satisfiability (SAT) is the class of problems of finding a feasible interpretation that satisfies a particular Boolean formula based on the logical rule. Based on the literature in , SAT is recognized to be a variant of NP-complete problem and is incorporated to generalize a plethora of constraint satisfaction problems. Thus, the breakthrough of SAT research contributed to the development of systematic variants of the SAT logical representation, for instance, 2 Satisfiability (2SAT). Theoretically, the fundamental 2SAT logical representation is composed of the following structural features :
(a) Given a set of x specified variables, where (bipolar states) illustrate the False and True outcomes respectively.
(b) A set of logical literals comprising either the positive variable or the negation of the variable, in terms of .
(c) Given a set of y definite clauses in a set of logical rules. Every is connected by the logical operator AND consecutively. Additionally, the 2 literals structured as given in (b) are connected by the logical operator OR .
Based on the features in (a) to (c), the precise definition of with different clauses can be seen as follows
whereby is a clause containing strictly 2 literals each
whereby the logical clauses in Eq. (3) are divided into 3 clauses, namely , and . In particular, the aforementioned clauses must be satisfied with the appropriate bipolar interpretations in specific arrangements in line with the logical rule. Therefore, if the bipolar interpretation or assignment reads , yields the False outcome or −1. Due to the compatibility of with the ample information storage mechanism, we implemented into DHNN as a logical representation.
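The clause structure in (a)–(c) can be made concrete with a short sketch. The paper's simulations are written in C++, but for illustration here is a minimal Python sketch of evaluating a 2SAT formula under a bipolar assignment; the encoding of a literal as an (index, sign) pair is our own convention, not the paper's notation:

```python
# A literal is a pair (index, sign): sign = +1 for the positive variable,
# sign = -1 for its negation. Variable states are bipolar: 1 (True) or -1 (False).

def literal_true(state, literal):
    index, sign = literal
    return state[index] * sign == 1   # a negated literal is true when S = -1

def satisfies(state, formula):
    # 2SAT: a conjunction (AND) of clauses, each a disjunction (OR) of
    # exactly 2 literals.
    return all(any(literal_true(state, lit) for lit in clause)
               for clause in formula)

# Example: (A OR NOT B) AND (NOT A OR C)
formula = [[(0, 1), (1, -1)], [(0, -1), (2, 1)]]
print(satisfies([1, -1, 1], formula))   # True
print(satisfies([1, 1, -1], formula))   # False
```

Any assignment for which `satisfies` returns `False` corresponds to the False outcome (−1) described above.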
Specifically, the fundamental classification of DHNN with i-th activation is shown as follows
where and refer to the neuron threshold and the second-order synaptic weight of the network respectively. In most DHNN research, is chosen as the standard threshold parameter. Note that N denotes the total number of 2SAT literals in a logical representation. Then, is defined as the connection between neuron and . This paper utilizes DHNN to avoid any intervention of a hidden layer. A hidden layer requires additional optimized parameters that potentially disrupt the signal of the local field in (4). In other words, a suboptimal signal will lead to suboptimal synaptic weights, which cause the final state to be trapped in a local minimum energy. The motivation for employing in DHNN (DHNN-2SAT) is the potential of the logical rule to govern the output of the network symbolically. Thus, will take advantage of the DHNN content addressable memory as a remarkable storage, especially when applied in logic mining.
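The local field and asynchronous update described above can be sketched as follows. This is an illustrative Python rendering (the paper's implementation is in C++); the stopping rule, a fixed point over one full sweep, is the standard Hopfield convention:

```python
def local_field(state, W, theta, i):
    # h_i = sum_j W_ij * S_j - theta_i
    return sum(W[i][j] * state[j] for j in range(len(state))) - theta[i]

def retrieve(state, W, theta, max_sweeps=100):
    # Asynchronous updates S_i <- sign(h_i) until the network reaches a
    # stable state (a fixed point, i.e. a minimum of the energy function).
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            new = 1 if local_field(state, W, theta, i) >= 0 else -1
            if new != state[i]:
                state[i], changed = new, True
        if not changed:
            break
    return state

# Two mutually excitatory neurons with theta = 0 (the standard choice above):
W = [[0, 1], [1, 0]]
theta = [0, 0]
print(retrieve([-1, 1], W, theta))  # [1, 1]
```

A suboptimal set of weights `W` would leave this update dynamics prone to the local-minimum trapping described above.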
Logic mining is a paradigm that uses logical rules to simplify the information of a data set. Inspired by the study in , logic mining was successfully utilized by implementing the reverse analysis method to induce all possible logical rules that generalize the behavior of the data set. However, the main task in assessing the behavior of the data set with the pre-defined goal is the extraction of the correct logical rule so that the quality of the data generalization is efficiently evaluated. The structure of the optimum must consist of possible tractable inferences and be capable of categorizing the outcomes of the real datasets. The conventional paradigm is to formulate and propose a data mining method that capitalizes on the learned integrated with DHNN. The 2 Satisfiability based Reverse Analysis Method (2SATRA) is a method that utilizes DHNN to learn and extract from a particular dataset with different levels of instances and attributes.
Given a set of data, where and x is the number of tested attributes. Note that the number of tested attributes is randomly chosen from the factors that contribute to the outcome. It is worth mentioning that the role of 2SATRA is to find the final neuron state that maps from the learning neuron states. Throughout the learning phase, each dataset will be evaluated in order to find the synaptic weights by using the Wan Abdullah method . Tab. 1 illustrates all the possible synaptic weights for .
For instance, if the given dataset reads , 2SATRA will convert the logical assignments or interpretations into the logical representation . Based on Tab. 1, the acquired synaptic weights for are , and respectively. In this work, we propose the permutation of the attributes in order to find the best interpretation that will generalize the behaviour of the data set. Therefore, several possible permutations for are implemented, such as in Eqs. (5) and (6).
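The synaptic weights tabulated in Tab. 1 follow from the Wan Abdullah method: write the inconsistency cost of each clause, expand it, and match terms against the Lyapunov energy. A hedged Python sketch for a single 2SAT clause (the sign-based encoding of literals is our own assumption; the paper tabulates the resulting values):

```python
def clause_synaptic_weights(s_a, s_b):
    # Wan Abdullah method for one 2SAT clause (L_A OR L_B), where s = +1 for
    # a positive literal and s = -1 for a negated one. The clause's
    # inconsistency cost is (1/4)(1 - s_a*S_A)(1 - s_b*S_B); expanding it and
    # matching terms against the energy
    #   H = -(1/2) * sum_ij W_ij S_i S_j - sum_i W_i S_i   (W symmetric)
    # gives:
    W_A = s_a / 4.0            # linear (bias) weights
    W_B = s_b / 4.0
    W_AB = -s_a * s_b / 4.0    # second-order synaptic weight
    return W_A, W_B, W_AB

# Clause (A OR B):
print(clause_synaptic_weights(1, 1))     # (0.25, 0.25, -0.25)
# Clause (A OR NOT B):
print(clause_synaptic_weights(1, -1))    # (0.25, -0.25, 0.25)
```

Summing these per-clause contributions over all clauses of yields the full weight matrix stored in the content addressable memory.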
Based on Eq. (5), the possible permutation for is as follows
In this context, the embedded into DHNN exhibits more possible attribute arrangements, and we only consider the structure of in the learning phase of DHNN. Then, the will be selected as the if it complies with the criteria in Eq. (7).
where is the number of logical rules and is the acceptance tolerance range. The logical will determine the behaviour of the DHNN, and the logical along with the acquired synaptic weights will be stored in the content addressable memory for the retrieval phase. The process of generating induced logical rules, , for this programme follows exactly that of the conventional 2SATRA. Note that the implementation of permuted attribute arrangements with 2SATRA is abbreviated as P2SATRA. To further test the performance of P2SATRA, the obtained will be compared with the testing datasets, . Algorithm 1 illustrates the pseudocode of the proposed P2SATRA, while Fig. 1 shows the execution of the proposed P2SATRA.
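The permutation mechanism itself can be sketched briefly. Each permutation of the attribute list pairs the attributes differently into 2SAT clauses, giving a distinct candidate logic for the learning phase; the cap of 100 permutations per execution is taken from the experimental setup reported later in the paper. An illustrative Python sketch (function and variable names are ours):

```python
from itertools import permutations

def candidate_logics(attributes, n_clauses, max_perms=100):
    # Each permutation of the attribute list yields a different pairing of
    # attributes into 2SAT clauses, hence a different candidate logic.
    logics = []
    for k, perm in enumerate(permutations(attributes)):
        if k >= max_perms:
            break
        clauses = [tuple(perm[2 * i:2 * i + 2]) for i in range(n_clauses)]
        logics.append(clauses)
    return logics

attrs = ["A", "B", "C", "D"]
for logic in candidate_logics(attrs, n_clauses=2, max_perms=3):
    print(logic)
# [('A', 'B'), ('C', 'D')]
# [('A', 'B'), ('D', 'C')]
# [('A', 'C'), ('B', 'D')]
```

In the conventional 2SATRA only the first (fixed) pairing would be learned; the candidates beyond it are what P2SATRA adds.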
Based on Fig. 1 and Algorithm 1, P2SATRA starts by identifying a random logic , which leads to . In this context, will be disregarded to ensure the satisfiable property of the . After obtaining the synaptic weights via , P2SATRA proceeds with the retrieval phase of the DHNN. The main difference between the conventional 2SATRA and P2SATRA is the position of the attributes in the during the learning phase and retrieval phase. In this context, the final neuron state of the proposed P2SATRA has a larger search space compared to the conventional 2SATRA. Compared to other optimization methods, the permutation operator in Eq. (2) requires solving only a non-complex optimization problem to arrive at the optimal induced logic. Thus, P2SATRA only deals with the permutation operator to uncover possible combinations of the connectives in .
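The overall flow of Fig. 1 and Algorithm 1 can be summarized in a high-level sketch. The helper names below (`learn`, `retrieve_induced`, `accuracy`) are hypothetical stand-ins for the DHNN learning phase, retrieval phase, and testing-set comparison described above, not functions from the paper's code:

```python
def p2satra(candidate_logics, learn_data, test_data,
            learn, retrieve_induced, accuracy):
    # learn() runs the DHNN learning phase for one permuted logic and returns
    # synaptic weights, or None when the acceptance criterion (Eq. (7)) fails;
    # retrieve_induced() runs the retrieval phase to produce an induced logic.
    best_logic, best_acc = None, -1.0
    for logic in candidate_logics:          # one candidate per permutation
        weights = learn(logic, learn_data)
        if weights is None:                 # learning failed for this permutation
            continue
        induced = retrieve_induced(weights, logic)
        acc = accuracy(induced, test_data)
        if acc > best_acc:                  # keep the best induced logic
            best_logic, best_acc = induced, acc
    return best_logic, best_acc
```

Setting `candidate_logics` to a single fixed arrangement recovers the conventional 2SATRA; supplying the permuted candidates gives P2SATRA its larger search space.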
In this paper, the impact of logical permutation in attaining the optimal induced logic is examined. Ten publicly available datasets (B1–B10) are acquired from the open-source UCI repository via https://archive.ics.uci.edu/ml/datasets.php. Moreover, 1 real-life dataset (B11) is taken from the Department of Irrigation and Drainage, Malaysia. Tab. 2 encloses the list of datasets used in this experiment. Based on the analysis from several previous works, this study utilizes the standard train-test split method, with 60% of the data set as learning data and the remaining 40% as testing data . The data will be converted into bipolar representation (1 and −1) using k-means clustering as proposed by . The conversion is applied in both the learning and retrieval phases. To guarantee reproducibility of the results, the implementation code of our proposed P2SATRA with the datasets can be retrieved from https://bit.ly/3nyUdm8.
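The bipolar conversion via k-means can be sketched for a single attribute column. This is a one-dimensional illustration with k = 2; the initialization at the column extremes and the mapping of the lower cluster to −1 are our own conventions for the sketch, not details fixed by the cited work:

```python
def to_bipolar(column):
    # 1-D k-means with k = 2: assign each value to the nearer of two
    # centroids, then map the lower cluster to -1 and the upper to 1.
    c1, c2 = min(column), max(column)   # initialize centroids at the extremes
    for _ in range(50):
        g1 = [x for x in column if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in column if abs(x - c1) > abs(x - c2)]
        new1 = sum(g1) / len(g1) if g1 else c1
        new2 = sum(g2) / len(g2) if g2 else c2
        if (new1, new2) == (c1, c2):    # centroids stable: converged
            break
        c1, c2 = new1, new2
    return [-1 if abs(x - c1) <= abs(x - c2) else 1 for x in column]

print(to_bipolar([0.1, 0.2, 0.15, 5.0, 4.8]))  # [-1, -1, -1, 1, 1]
```

The same conversion is applied to the learning and testing portions of each dataset before the logic mining begins.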
As the primary impetus of this work is to evaluate the quality of the induced logical representation generated by P2SATRA, we restrict the baseline comparison to standard methods that are capable of attaining the induced logic from the real datasets. Tabs. 3–6 show the lists of important parameters for the various logic mining approaches. The core concern in combining more attributes is the possible increase in learning error as a result of a non-effective learning phase of the HNN . Hence, the hyperbolic tangent activation function is applied to squash the final state of the neurons because of its capability and behaviour, namely its continuity, smoothness and non-linearity. In the retrieval phase of the logic mining method, the neuron initialization is set to be random in order to lessen the potential bias of the network.
Various performance metrics, such as sensitivity, precision, F-Score and the Matthews Correlation Coefficient (MCC), are employed to analyze and assess the overall capability and the significant effect of logical permutation in P2SATRA. The performance of P2SATRA is calculated based on the confusion matrix. Specifically, (true positive) refers to the number of positive instances that are correctly classified, (false negative) denotes the number of positive instances that are incorrectly classified, (true negative) is the number of negative instances that are correctly classified, whereas (false positive) demarcates the number of negative instances that are incorrectly classified as positive by the model. In the context of logic mining, can be calculated if and can be calculated if . The sensitivity metric, , examines the main positive result for an instance with respect to a particular condition.
Therefore, precision is employed to gauge the model's predictive capability. The computation and formulation of precision are defined as follows:
Accuracy is the standard indicator for verifying the performance of the classification process. It determines the percentage of instances categorized correctly (with emphasis on the true outcomes in the confusion matrix):
F-Score is a substantial indicator of the maximum probability of an optimal result, clearly demonstrating the capability of the computational model. Moreover, the F-Score is the harmonic mean of two performance metrics, precision and sensitivity.
In addition, the Matthews Correlation Coefficient is utilized to quantify the performance of all the logic mining approaches by taking into account the eight major ratios derived from the amalgamation of all the elements of a confusion matrix. The is given:
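The five metrics above can be computed directly from the confusion matrix. A self-contained Python sketch using their standard definitions (returning `None` when a denominator is zero, mirroring the "no value retrieved" cases reported later for some datasets):

```python
import math

def metrics(TP, TN, FP, FN):
    # Accuracy, precision, sensitivity, F-Score and MCC from the confusion
    # matrix; a metric is None when its denominator vanishes.
    acc = (TP + TN) / (TP + TN + FP + FN)
    pr = TP / (TP + FP) if (TP + FP) > 0 else None
    se = TP / (TP + FN) if (TP + FN) > 0 else None
    f_score = 2 * pr * se / (pr + se) if pr and se else None
    denom = math.sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    mcc = (TP * TN - FP * FN) / denom if denom > 0 else None
    return acc, pr, se, f_score, mcc

acc, pr, se, f, mcc = metrics(TP=8, TN=7, FP=2, FN=3)
print(round(acc, 3), round(pr, 3), round(se, 3), round(f, 3), round(mcc, 3))
# 0.75 0.8 0.727 0.762 0.503
```

Note that MCC stays informative even when the two outcome classes are imbalanced, which is why it is reported alongside accuracy here.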
All simulations will be implemented and executed by employing the Dev C++ Version 5.11 software due to the versatility of the programming language and the user-friendly interface of the compiler. Hence, the simulations will be implemented in C++ by using a computer with an Intel Core i7 2.5 GHz processor, 8 GB RAM and Windows 8.1. Following that, the threshold CPU time for each execution was set to 24 h, and any possible outputs that went beyond the threshold time were omitted entirely from the analysis. All experiments were executed on the same device to prevent possible bad sectors in the memory during the simulations.
This study created the 2SATRA integrated with HNN-2SAT to simulate and analyze the effect of logical permutation, forming P2SATRA. The composition of attributes will be randomly permuted, as opposed to the previous 2SATRA models [11,12]. In this work, our proposed P2SATRA will be compared with conventional logic mining models, namely the RA, 2SATRA and E2SATRA methods.
The results of Acc, Pr, Se, F-Score and MCC for the four variants of logic mining approaches can be viewed in Tabs. 7–11. Then, Tab. 12 encloses the induced logic obtained for the 11 real datasets. According to the results, there are several successful dominances and strong points for P2SATRA, which are enclosed based on the analysis of the different performance metrics. Based on the analysis, P2SATRA achieves the maximum optimal values for the 11 real datasets, including the time-series dataset in B11. This manifests the capability of the logical permutation in P2SATRA in enhancing the accuracy of the logic mining for all datasets used in this work. According to a thorough observation, the next feasible models that compete with P2SATRA in terms of are RA and E2SATRA. This implies that the proposed logic mining model has been enhanced by using the permutation operator to diversify the induced logics, leading to higher accuracy by tuning a high permutation parameter (maximum of 100 permutations/execution). Based on Tab. 1, all of the accuracy recorded by P2SATRA achieved , which confirms the capability to correctly differentiate and for all datasets in this study. Following that, three datasets (B4, B9, and B11) attain , which implies that P2SATRA accurately predicts all values of and . This shows the capability of the proposed P2SATRA to work well with time-series datasets, which require proper enumeration in attaining the best induced logic as compared with the three counterparts. An interesting observation can be found where 2SATRA and RA recorded zero during the execution with the B11 dataset. The 100% difference in the Acc of P2SATRA as opposed to the standard logic mining approaches in B11 confirms the significant effect of logical permutation with effective synaptic weight management during time-series data extraction.
Statistically, P2SATRA has recorded an exceptional average rank of 1.045 for accuracy, which is about 286% lower than that of RA and E2SATRA and about 322.7% lower than that of 2SATRA.
(a) For , P2SATRA outperforms the other logic mining models in 7 out of 11 datasets. The higher value of indicates the superiority of the proposed model in retrieving and generating more . Hence, the nearest model that strongly competes with P2SATRA is E2SATRA. However, no values were reported in the B2, B5 and B11 datasets, indicating the failure to retrieve any value for . This occurs because P2SATRA and the other logic mining models fail to retrieve values of the positive outcomes, consisting of and . The proposed P2SATRA achieved a value of for 3 real datasets, which entails that P2SATRA correctly predicts the tested data in evaluation with all the positive outcomes. One interesting result was recorded by 2SATRA for B9, where the , implying that the model fails to attain any values in the confusion matrix. This shows a major weakness of the standard 2SATRA that requires reinforcement via the logical permutation approach. To support this, 2SATRA obtained an average rank of 3.1878, which is approximately 230% higher than the average rank for P2SATRA.
(b) For the result of , P2SATRA outperforms the other logic mining models in 9 out of 11 datasets. In addition, according to the F-Score analysis, P2SATRA recorded exceptional results in 10 out of 11 datasets as compared to 2SATRA, RA and E2SATRA. However, both the and F-Score results for B11 are not able to retrieve any value due to the failure to generate any positive and negative outcomes. This highlights the similar capability of P2SATRA with the other logic mining models when being assessed with and F-Score for B11, which is a variant of time-series dataset. Overall, the nearest model that competes with P2SATRA is E2SATRA, with an average rank of 2.500. Moreover, P2SATRA has an average rank of 1.909, which is the best compared to the other conventional logic mining approaches based on the analysis. In addition, P2SATRA recorded the superior average rank of F-Score at 1.545, almost 99.5% lower than the worst, 2SATRA, with an average rank of 3.000. Hence, both findings statistically authenticate the acceptable performance of P2SATRA for most of the multivariate datasets as opposed to the conventional logic mining approaches.
(c) As for , the P2SATRA logic mining model shows the highest optimal value among the other models in 6 out of 11 datasets. Meanwhile, 5 datasets in are not able to retrieve any value. No value of is reported in B2, B4, B5 and B11 for all logic mining methods due to the non-existence of positive outcomes for these datasets. In addition, no MCC value is recorded for the B7 data because P2SATRA is not able to retrieve values of and . However, the performance of P2SATRA has been exceptional, as it recorded the best for 55% of the datasets. The datasets whose is equal to zero and approaching zero are the B10 and B9 datasets respectively. This demonstrates that P2SATRA has excellent capability in distinguishing all outcome domains of the confusion matrix and is solid proof of powerful predictive capability.
(d) According to the average rank for all the datasets in terms of , , , F-Score and , P2SATRA has the highest average rank compared to the other models. It has been statistically proven that 2SATRA is the weakest among the conventional logic mining approaches when being trained and tested with the real datasets in this study.
(e) Further analysis via the Friedman rank test has been performed for all 11 datasets with and degrees of freedom, . The p-values for , , , F-Score and are , , , and , respectively. In terms of , and F-Score, the null hypothesis of equal performance for all the logic mining models was rejected. P2SATRA recorded average ranks of 1.045, 1.375 and 1.545 in terms of , and F-Score, respectively, which are the highest compared to the other existing models. As can be seen, E2SATRA is the nearest method that competes with P2SATRA, with average ranks of 2.864, 2.688 and 2.818 for , and F-Score, respectively. Thus, the results for , and F-Score statistically validate the dominant performance of P2SATRA. However, the null hypothesis of equal performance for all the logic mining models was accepted in terms of and . It can be concluded that there is no performance difference between P2SATRA and the other models for and .
In this work, a new alternative approach for attaining the optimal induced logic entrenched in multivariate or time-series datasets, introducing logical permutations in 2SATRA, has been successfully developed. The enhancement can be seen clearly in the substantial accuracy improvement of the proposed model as opposed to the existing approaches, indicating success in the generalization of the datasets. In this study, we have exploited the multiple connections between the attribute arrangements in generating the with different accuracy values. Given the high expressibility and interpretability of the proposed P2SATRA, the effects of the logical permutations have been very significant and substantial. By adopting various forms of the 2SAT logical structure during the learning phase of HNN, P2SATRA outperformed the 2SATRA, E2SATRA and RA approaches when measured with performance metrics such as accuracy, precision, sensitivity, F-Score and MCC after the logic mining analysis with 11 different real datasets. In future work, it will be interesting to infuse different logical rules such as Maximum Satisfiability , Y-Type Random Satisfiability , G-Type Random Satisfiability  and Random k Satisfiability . In terms of network architecture, it will be interesting if other learning mechanisms such as those in [21,22] were embedded into logic mining.
Acknowledgement: The authors would like to thank all AIRDG members and those who gave generously of their time, ideas and hospitality in the preparation of this manuscript.
Funding Statement: This research was supported by Universiti Sains Malaysia for Short Term Grant with Grant Number 304/PMATHS/6315390.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.