Forecasting of Appliances House in a Low-Energy Depend on Grey Wolf Optimizer

Abstract: This paper presents and analyses data-driven prediction models for the energy use of appliances. The data include temperature and humidity readings from a wireless sensor network. The building envelope is meant to minimize energy demand, that is, the energy required to power the house independent of the efficiency of the appliances and mechanical systems. Regression is the task of approximating a mapping function from the input variables to a continuous output variable. The paper discusses the forecasting framework FOPF (Feature Optimization Prediction Framework), which includes feature selection optimization: a hybrid optimization technique removes non-predictive parameters in order to select the best features. A k-nearest neighbors (KNN) ensemble prediction model for the appliance energy-use data was tested against several baseline machine learning algorithms. The comparison showed that the KNN ensemble achieved the best accuracy and the lowest error, with RMSE = 0.0078. Finally, the suggested ensemble model's performance is assessed using a one-way analysis of variance (ANOVA) test and the Wilcoxon Signed Rank Test (two-tailed P-value: 0.0001).


Introduction
Many studies have sought to understand appliance energy use in buildings. Appliances represent a significant portion (between 20% and 30%) of electrical energy demand, and appliances such as televisions and consumer electronics operating in standby have been linked to a 10.2% increase in electricity consumption. Regression models of energy use can help to understand the relationships between different factors and to evaluate their effects [1].
Prediction models of electrical energy consumption in buildings are useful for various applications. For example, they can be used to determine adequate sizing of photovoltaics and energy storage in order to reduce power flow into the grid, in model predictive control applications where the loads are required, and for demand-side management (DSM) and demand-side response (DSR). They can also be used in building performance simulation analysis [2].
Electricity consumption in residential buildings is explained by two main factors: the type and number of electrical appliances, and the use of those appliances by the occupants. Typically, the two factors are interrelated. In addition, the location of the building can influence how the appliances are used. Appliance use also shows up in the indoor conditions near the appliance, for example in the temperature, humidity, vibrations, light, and noise [3].
The No Free Lunch (NFL) theorem and the aim of addressing existing flaws motivated the suggested optimization method. According to the NFL theorem, there is no one-size-fits-all meta-heuristic that can solve all optimization problems, which explains why some meta-heuristics are better at solving certain optimization problems than others and why new optimizers continue to be proposed. Slow convergence, the balance between exploration and exploitation, and stagnation in local optima are some of the shortcomings of existing optimization methods that the suggested algorithm addresses [4].
The main goal of this paper is to understand the relationships between appliances' energy consumption and different predictors, and to introduce prediction models that can deal with the appliances energy dataset. The paper proposes the Feature Optimization Prediction Framework (FOPF) and discusses its performance in predicting energy consumption compared with different models (linear regression, Artificial Neural Network, and Random Forest).

Literature Review
Many studies discuss electricity load prediction and the parameters that drive it. Typically, studies have used models such as multiple regression, neural networks, and support vector machines to forecast electricity demand. The models have ordinarily considered parameters such as wind speed, outdoor temperature, weekends, holidays, global solar radiation, and hour of the day [5].
Most papers on this topic highlight the following points [6]:
• Appliances account for a significant portion of aggregate electricity demand.
• The energy consumption of appliances may be broken down into different contributions and may sometimes include HVAC (heating, ventilation, and air conditioning).
• Is the weather representative enough to improve the prediction of appliances' energy consumption?
• Could temperature and humidity measurements from a wireless network help in energy prediction?
• Which parameters are the most important in energy prediction?
Meta-heuristic optimization algorithms have grown in popularity due to their ease of use and flexibility compared to traditional and precise optimization methods such as Greedy Search and Local Search. They are also versatile, because they can be used in various domains and applications without significant changes to their design and implementation. Likewise, owing to their stochastic character, they can avoid stagnation in local optima by exploring the search space extensively [7].

The Proposed FOPF
The aim of regression analysis is to predict an outcome based on historical data. However, real regression problems often involve uncertain data, including uninformative features, which significantly increase the error of the algorithms. To overcome this issue, we propose FOPF, which comprises two phases, as shown in Fig. 1: a first phase of feature selection optimization and a second phase of training the ensemble model.

First Phase: Feature Selection Optimization
Feature analysis is usually recommended before regression analysis.

Dataset
"The data set is at 10 min for about 4.5 months. The house temperature and humidity conditions were monitored with a ZigBee wireless sensor network."

Removal of Null Values
We removed rows that contain null values or any missing data.

Feature Scaling (between 0-1)
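We used Min-Max scaling to bring each attribute's value into the range between 0 and 1:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the original value and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the attribute.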
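
The two preprocessing steps (dropping incomplete rows and Min-Max scaling) can be sketched as follows. This is only an illustration in plain Python: the function names and the sample readings are hypothetical and are not taken from the dataset, and a constant column is mapped to 0.0 as an assumed convention.

```python
def drop_incomplete_rows(rows):
    """Keep only rows in which every value is present (not None)."""
    return [row for row in rows if all(value is not None for value in row)]


def min_max_scale(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical readings: (temperature in deg C, relative humidity in %)
readings = [(19.0, 45.0), (21.0, None), (23.0, 55.0), (20.0, 50.0)]
clean = drop_incomplete_rows(readings)          # the row with None is removed
temperature_scaled = min_max_scale([row[0] for row in clean])
```

Each attribute is scaled independently, so the minimum and maximum are taken per column rather than over the whole table.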

Information gain
Information gain (IG) ranks each feature according to its entropy and selects the most important features according to a prespecified threshold. To complete feature selection for the preprocessed dataset, a hybrid optimization technique is used that combines three powerful algorithms. The first is PSO, in which individuals move under the influence of their local best positions and of the global best position. The second optimizer in our proposed hybrid approach is the Grey Wolf Optimizer (GWO), a swarm-based meta-heuristic optimizer that mimics the social hierarchy and the foraging behavior of grey wolves. Individuals in GWO move under the influence of the positions of the three leaders: alpha, beta, and delta. The third is a genetic algorithm, which accelerates convergence by refining the position of a selected solution around randomly selected leaders; its mutation operator randomly modifies one or more components of the offspring.
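
As a rough illustration of threshold-based selection with information gain, the sketch below computes IG for discrete values. It assumes that continuous sensor readings and the target have already been binned into labels, a step the text does not describe; the function names and the threshold are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, target):
    """IG = H(target) - H(target | feature) for discrete values."""
    total = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(target) - conditional


def select_features(features, target, threshold):
    """Return feature names ranked by gain, keeping those at or above the threshold."""
    gains = {name: information_gain(values, target) for name, values in features.items()}
    ranked = sorted(gains, key=gains.get, reverse=True)
    return [name for name in ranked if gains[name] >= threshold]


# Hypothetical binned data: "a" determines the target, "b" carries no information.
target = ["low", "low", "high", "high"]
features = {"a": ["cold", "cold", "warm", "warm"], "b": ["x", "y", "x", "y"]}
selected = select_features(features, target, threshold=0.5)
```

In this toy case feature "a" has a gain of one bit and is kept, while "b" has zero gain and is filtered out.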

Particle Swarm Optimization
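A particle's velocity is the change in its position. During flight, the particle's velocity is randomly accelerated toward its previous best position and a neighborhood best solution, and its position is changed accordingly. In the standard PSO formulation, the update is

$$v_i^{k+1} = w\,v_i^{k} + c_1 r_1 \left(p_i - x_i^{k}\right) + c_2 r_2 \left(g - x_i^{k}\right), \qquad x_i^{k+1} = x_i^{k} + v_i^{k+1}$$

where $x_i^{k}$ and $v_i^{k}$ are the position and velocity of particle $i$ at iteration $k$, $p_i$ is its previous best position, $g$ is the neighborhood best solution, $w$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are random numbers in $[0, 1]$.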
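
A minimal sketch of one PSO update for a single particle, assuming the standard inertia-weight form of the velocity rule. The coefficient values `w`, `c1`, and `c2` are illustrative defaults, not settings reported in this paper.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity and position update for a single particle."""
    new_velocity = []
    new_position = []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()   # fresh random factors per dimension
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity
```

When a particle already sits on both its personal best and the global best, the two attraction terms vanish and only the inertia term moves it.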

Grey Wolf Optimizer
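GWO originates from a mathematical formulation of the hunting technique of grey wolves, as shown in Fig. 2. The position of each wolf is updated using the following equations:

$$\vec{D} = \left|\vec{C} \cdot \vec{X}_p(t) - \vec{X}(t)\right|, \qquad \vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}$$

where $t$ refers to the current iteration, $\vec{A}$ and $\vec{C}$ are coefficient vectors, $\vec{X}_p$ is the position of the prey, and $\vec{X}$ is the position of the grey wolf. The vectors are calculated using the following equation:

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \qquad \vec{C} = 2\,\vec{r}_2$$

where, in the standard GWO formulation, the components of $\vec{a}$ decrease linearly from 2 to 0 over the iterations and $\vec{r}_1$ and $\vec{r}_2$ are random vectors in $[0, 1]$.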
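
As a rough illustration, the sketch below performs one GWO position update. It assumes the standard formulation, in which the prey position is approximated by the three leaders (alpha, beta, delta) and the new position is the average of the three resulting candidates. The function names are hypothetical.

```python
import random


def gwo_step(wolf, alpha, beta, delta, a, rng=random):
    """Move one wolf toward the average of the positions suggested by the three leaders."""
    new_position = []
    for j, x in enumerate(wolf):
        candidates = []
        for leader in (alpha, beta, delta):
            A = 2 * a * rng.random() - a      # A = 2a*r1 - a
            C = 2 * rng.random()              # C = 2*r2
            D = abs(C * leader[j] - x)        # distance to this leader
            candidates.append(leader[j] - A * D)
        new_position.append(sum(candidates) / 3)
    return new_position


def linear_a(iteration, max_iterations):
    """Control parameter a, decreased linearly from 2 to 0."""
    return 2 - 2 * iteration / max_iterations
```

Large values of `a` early in the run allow `|A| > 1`, which pushes wolves away from the leaders (exploration); as `a` approaches 0 the wolves converge on the leaders (exploitation).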

Mutation
Mutation is the random modification of portions of a solution, which enhances population diversity and provides a means of escape from a local optimum. Adding new traits to the population pool may be beneficial, in which case the mutated individual has a high fitness value and is likely to be selected many times, or it may be detrimental, in which case the individual is removed from the population pool. The mutation operator involves randomly generating three indices in the range [1, n], where n is the population size.

Crossover
Crossover swaps portions of one solution, or chromosome, with those of another. Its basic function is to provide mixing and convergence of solutions in a subspace. Crossover is performed between the new mutant solution vector Vi and the original solution vector Xi.
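
The text does not give the exact mutation and crossover equations. The sketch below assumes a differential-evolution-style scheme, which matches the description of three random indices and a mutant vector Vi crossed with the original Xi. The scale factor `F`, the crossover rate `CR`, the exclusion of index i, and the use of 0-based indices are all assumptions made for illustration.

```python
import random


def mutate(population, i, F=0.5, rng=random):
    """Build a mutant vector V_i from three distinct random members other than i."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    return [x1 + F * (x2 - x3)
            for x1, x2, x3 in zip(population[r1], population[r2], population[r3])]


def crossover(original, mutant, CR=0.9, rng=random):
    """Binomial crossover: take each component from the mutant with probability CR."""
    forced = rng.randrange(len(original))  # guarantees at least one mutant component
    return [m if (rng.random() < CR or j == forced) else x
            for j, (x, m) in enumerate(zip(original, mutant))]
```

Forcing one component to come from the mutant ensures that the trial vector always differs from the original when the mutant does.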

Elitism
Elitism, or survival of the fittest, means passing high-fitness solutions on to future generations, which is generally done through some form of best-solution selection.
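
One simple form of best-solution selection is sketched below, assuming a minimization problem in which lower fitness is better. The function name and the `elite_count` parameter are hypothetical.

```python
def next_generation(parents, offspring, fitness, elite_count=1):
    """Carry the best `elite_count` parents over and fill the rest with the best offspring."""
    elites = sorted(parents, key=fitness)[:elite_count]
    rest = sorted(offspring, key=fitness)[:len(parents) - elite_count]
    return elites + rest


# Hypothetical one-dimensional solutions scored by their squared value.
parents = [[0.1], [5.0]]
offspring = [[2.0], [3.0]]
survivors = next_generation(parents, offspring, fitness=lambda s: s[0] ** 2)
```

Because the elite is copied unchanged, the best fitness in the population can never get worse from one generation to the next.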

Second Phase: Training Ensemble Model
After feature selection optimization, we divided the dataset into two parts: training data (75%) and testing data (25%). The original dataset contains 19737 records; the training data comprises 14804 records, and the testing data contains 4933 records. The training data is used to train the baseline models (linear regression, Artificial Neural Network, and Random Forest), the error and accuracy of each are calculated, and the results are then compared with our proposed KNN ensemble.
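
A 75/25 split can be sketched as below. The text does not say how the split was drawn, so the random shuffle, the seed, and the rounding rule are assumptions, and the exact record counts depend on them.

```python
import random


def split_records(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)      # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_records(range(100))
```

For time-ordered sensor data, a chronological split (training on earlier records, testing on later ones) is a common alternative to shuffling.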

Linear regression
Linear regression models the relation between two variables by fitting a linear equation to the observed data. One variable is regarded as an explanatory variable, while the other is viewed as a dependent variable. A modeler may, for example, use a linear regression model to relate people's weights to their heights. More generally, linear regression models the linear relationship between a dependent variable and one or more explanatory variables. The linear regression model was able to predict the value with RMSE = 0.0131.
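
For the single-variable case, the least-squares line can be computed in closed form. The sketch below is a minimal illustration with made-up numbers, not the model fitted in this study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical points lying exactly on y = 1 + 2x.
intercept, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

With several explanatory variables the same idea applies, but the coefficients are obtained by solving the normal equations rather than from this one-dimensional formula.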

Artificial Neural Network (ANN)
An artificial neural network is made up of three or more linked layers. Input neurons make up the first layer. These neurons send their input to deeper layers, which in turn deliver the final output data to the output layer. The inner layers are hidden and are formed by units that apply a series of transformations to the information passed from layer to layer. The multi-layer perceptron (MLP) is a type of feedforward neural network with one or more hidden layers between the input and output layers. MLP layers are fully connected, meaning that a neuron in one layer is connected to all neurons in the following layer. Each connection has its own weight value. The MLP is trained with the backpropagation algorithm. The network was able to predict the value with RMSE = 0.0944.

Random Forest Algorithm
The random forest technique extends the bagging method: it uses both bagging and feature randomness to generate an uncorrelated forest of decision trees. Feature randomness, also called feature bagging, ensures low correlation among the decision trees by generating a random selection of features. This is a fundamental distinction between decision trees and random forests: random forests evaluate only a subset of the available feature splits, whereas decision trees consider all of them. The random forest algorithm comprises a group of decision trees, each built from a bootstrap sample drawn from the training set. It combines several decision trees to improve generalizability and robustness over a single decision tree. The Random Forest was able to predict the value with RMSE = 0.0123.

Proposed KNN Ensemble
Ensemble approaches, which combine several regressors, have received much interest over the past two decades as potential strategies for improving the regression performance of weak learners. In many real-world applications, these approaches reduce regression error significantly and, in general, are more resistant to non-informative features than individual models. However, using an ensemble to improve KNN performance is a difficult challenge. Because KNN is already a stable regressor, the "traditional" ensemble approaches of bagging and boosting do not work with it. The component regressors must be accurate for an ensemble to improve performance.
The ensemble method combines the outputs of several algorithms by assigning a weight to each. Ensemble techniques have become one of the most significant approaches for improving the predictive ability of standard models, as shown in Fig. 3.

Figure 3: KNN ensemble
The output values of the K training records chosen as the nearest neighbors are used to predict the output value of the unknown test record. In its standard form, KNN regression uses the following formula:

$$\hat{y} = \frac{1}{K} \sum_{i=1}^{K} y_i$$

where $y_i$ is the output value of the $i$-th nearest neighbor. The KNN Ensemble predicted the value with the lowest error of all the models, RMSE = 0.0078.
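
A minimal sketch of KNN regression and of a weighted combination of several KNN regressors follows. How the ensemble members are built is not specified in the text; here they are assumed to differ only in the value of k, and the Euclidean distance, the weights, and the function names are illustrative assumptions.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Predict the mean target of the k training points nearest to `query`."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


def knn_ensemble_predict(train_x, train_y, query, ks, weights):
    """Weighted average of KNN predictions made with different k values."""
    predictions = [knn_predict(train_x, train_y, query, k) for k in ks]
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)


# Hypothetical one-feature training set.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 1.0, 2.0, 10.0]
single = knn_predict(train_x, train_y, (0.0,), k=2)
combined = knn_ensemble_predict(train_x, train_y, (0.0,), ks=[1, 2], weights=[1.0, 1.0])
```

Varying k is one way to obtain component regressors that are accurate yet differ from one another, which bagging alone does not achieve for a stable learner such as KNN.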

Results and Discussion
The KNN ensemble gives the lowest error compared to the three baseline models. KNN can control overfitting and handle missing values. The method relies on the wisdom of crowds in predicting with KNN regression. The RMSE and MAE results are shown in Tab. 1, and the real and predicted values of the KNN Ensemble are shown in Fig. 4. A positive correlation between appliances' consumption and weather conditions was also found, as shown in Fig. 5. Evidence is needed to back our strategy in order to reach a confident and trustworthy conclusion, which is where ANOVA comes in, as shown in Tab. 2. The Q-Q plot, also known as a quantile-quantile plot, is a graphical tool for determining whether a collection of data is likely to have come from a theoretical distribution such as a normal or exponential distribution; it is shown in Fig. 6. Tab. 3 shows the Wilcoxon Signed Rank Test.
Internally, histograms are used to summarize data and offer size estimates for searches, as shown in Fig. 7. Because these histograms are not offered to consumers or exhibited physically, a larger range of possibilities for their creation is available.
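
The two error measures reported in Tab. 1 follow their usual definitions, which can be sketched as below. The sample values are made up and are not results from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical scaled values in [0, 1].
actual = [0.0, 0.0, 0.0, 0.0]
predicted = [0.5, -0.5, 0.5, -0.5]
```

RMSE penalizes large errors more heavily than MAE, so RMSE is never smaller than MAE on the same predictions.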

Comparison with Optimization Algorithms
The comparison covers several optimization algorithms, namely the genetic algorithm (GA), particle swarm optimization (PSO), and grey wolf optimization (GWO). The results in Tab. 4 and Fig. 8 indicate the quality and superiority of the proposed framework FOPF and a reduction in the limitations of cost.

Conclusions
Feature ranking and data filtering to eliminate non-predictive factors proved most significant for forecasting appliance energy consumption. FOPF performs well in feature selection, where it depends on a hybrid optimization technique. There is therefore a need to control energy consumption by forecasting demand with machine learning techniques and by employing various optimization techniques. The comparative results show that the KNN Ensemble is appropriate for forecasting appliances' energy use, and they indicate a positive correlation between appliances' consumption and weather conditions.

Funding Statement: This work was supported by the Taif University Researchers Supporting Project Number (TURSP-2020/345), Taif University, Taif, Saudi Arabia.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.