A Hybrid Approach for Performance and Energy-Based Cost Prediction in Clouds

Abstract: With the rapid rise in the adoption of Cloud Computing, energy consumption has become one of the key cost factors that must be managed within cloud providers' infrastructures. Consequently, recent approaches and strategies based on reactive and proactive methods have been developed for managing cloud computing resources so that energy consumption and operational costs are minimized. However, to make better cost decisions in these strategies, performance and energy awareness should be supported at both the Physical Machine (PM) and Virtual Machine (VM) levels. Therefore, this paper proposes a novel hybrid approach that jointly considers the prediction of performance variation, energy consumption and cost of heterogeneous VMs. This approach integrates auto-scaling with live migration while maintaining the expected level of service performance, and uses power consumption and resource usage to estimate the VMs' total cost. Specifically, service performance variation is handled by detecting underloaded and overloaded PMs, so that decisions are made in a cost-effective manner. A detailed testbed evaluation demonstrates that the proposed approach not only predicts the VMs' workload and power consumption but also estimates the overall cost of live migration and auto-scaling during service operation, with high prediction accuracy on the basis of historical workload patterns.


models) have been developed by providers, which can significantly affect the adoption of Cloud Computing in industry. However, these approaches are complex and have a number of limitations, because customers are charged on the basis of pre-defined tariffs for the resources even if they have not used them [6][7][8]. In addition, the variation of energy cost is not considered in these approaches [9,10]. With the increasing cost of electricity, energy consumption is considered one of the major cost factors, with a great effect on the operational cost of the Cloud infrastructure [1][2][3][11]. Therefore, building a novel cost approach for Cloud services that adapts to energy costs is challenging and has attracted the attention of many researchers [1][2][3].
The energy consumed in clouds depends on two main factors: the efficiency of the physical resources and the strategies employed to manage these resources [12,13]. Consequently, numerous reactive and proactive methods, such as dynamic consolidation and resource provisioning, are used to manage Cloud resources efficiently [14]. For example, when the workload exceeds a specific threshold (e.g., 95% of CPU utilization), a corrective decision is made (e.g., VM migration or resource scaling) to avoid degradation of service performance. Proactive methods can make such corrective decisions (e.g., auto-scaling, live migration and re-allocation) at earlier stages, so that violations of the Service Level Agreement (SLA) are prevented and service performance is kept at an acceptable level. Therefore, studying and understanding the impact of these decisions is important for developing cost-efficient strategies and energy-efficient resource allocation methods. Moreover, estimating the future cost of Cloud services can help service providers make cost-effective decisions and offer suitable services that meet their customers' requirements. Furthermore, recent developments in managing the Cloud paradigm at different levels and in reducing energy consumption have received attention because they reduce the operational expenditure (OPEX) of Cloud providers.
The aim of this paper is to address the challenge of identifying cost-effective approaches for Cloud services by enabling awareness of energy consumption, performance variation and virtual-level cost in the Cloud environment. In addition, the result of this work can be integrated with reactive and proactive resource management methods, reinforcing them with performance and energy awareness so that cost-effective decisions are made and Cloud resources are managed efficiently. As a result, the consumed energy is reduced and the total cost for Cloud providers is minimized while service performance is maintained. The contributions reported in this paper can be summarized as follows:
• A novel hybrid approach is introduced for predicting the VMs' total cost, energy consumption and performance variation, in which auto-scaling and live migration are integrated in order to provide cost-effective strategies. Additionally, the trade-off among cost, energy and performance in the cloud environment is considered.
• A set of models and algorithms is proposed for enhancing VM consolidation and resource provisioning techniques, as well as supporting awareness of energy consumption, performance variation and cost in the Cloud infrastructure.
• The evaluation results, based on a real Cloud testbed, demonstrate the usability of the proposed approach and verify the capability of the prediction models.
The remainder of this paper is organized as follows: Section 2 introduces the hybrid approach for performance and energy-based cost prediction, which integrates VM auto-scaling with live migration. The experimental setup and design are presented in Section 3. This is followed by the evaluation and discussion of results in Section 4. Finally, Section 5 concludes the paper and discusses future steps.

Integration of VMs Auto-Scaling with Live Migration: A Hybrid Approach
Recently, resource provisioning and VM consolidation have been used to address workload fluctuations in the cloud environment. On the one hand, resource provisioning-based solutions (e.g., auto-scaling) can provide VMs with the additional resource capacity needed to satisfy Quality of Service (QoS) requirements. For instance, when one or more VMs are detected as overloaded (e.g., the workload surpasses the predefined upper threshold), the VMs should be scaled up/out to meet the application demands. Generally, VM auto-scaling is categorized into two main types, namely vertical scaling (i.e., scale-up/resizing) and horizontal scaling (i.e., scale-out) [15][16][17]. In vertical scaling, resources (e.g., vCPUs and memory) are added to existing VMs, while in horizontal scaling, additional VMs are created. Note that the additional resources are determined according to the application requirements. However, horizontal scaling requires a few minutes to be initiated [15,18,19], which is unacceptable for real-time and delay-sensitive computation [20,21]. Additionally, both vertical and horizontal scaling incur extra costs [17], namely scaling time (booting/rebooting), license fees for new VMs and energy overhead, which need further consideration [16]. On the other hand, VM consolidation-based solutions (e.g., live migration) can improve resource utilization and achieve energy efficiency in Clouds, since live migration allows VMs to be moved from one PM to another without service interruption [22]. This approach plays an important role in balancing the load among PMs and reducing the overall energy consumption. For instance, when a host is detected as underloaded (e.g., the workload is below the predefined lower threshold), it becomes a candidate for being switched off or put into power saving mode.
However, live migration is a resource-intensive operation [23] that affects the service performance of the migrating VMs as well as the services running on other VMs [24][25][26][27]. Moreover, the migration process incurs extra costs [17], including migration time and energy overhead [28,29]. Therefore, studying and understanding the influence of VM auto-scaling and live migration is important for developing cost-effective strategies for Cloud services.
Numerous approaches for resource provisioning and VM consolidation have emerged independently in the literature [16,17,21,24,25,30,31,32] with the goal of balancing the load, increasing the capacity of VM resources and reducing energy-related costs. To minimize operational costs while achieving performance objectives, Cloud providers may automatically perform VM consolidation and resource provisioning to match workload changes and prevent any performance loss. A proactive framework can take preventive actions on the fly (e.g., VM auto-scaling, migration and re-allocation) at earlier stages, so that degradation of service performance is avoided. In addition, the effectiveness of the framework depends on the actuators/decisions that can be implemented during service operation. This solution would enable Cloud providers to make better use of their infrastructure in terms of maintaining service performance and reducing power consumption and operating cost. Besides, estimating the future cost of Cloud services helps service providers offer appropriate services that satisfy their customers' requirements [33].
Therefore, the proposed framework in [34] has been extended to support a new hybrid approach for predicting the total cost of heterogeneous VMs, considering their energy consumption and performance variation, as depicted in Fig. 1. More specifically, the costs of auto-scaling and live migration processes are integrated to determine the decision, in which the issues related to quality characteristics (e.g., energy consumption and application performance) are considered.
In addition, the Autoregressive Integrated Moving Average (ARIMA) model is utilized to predict the workload of PMs/VMs in order to handle the performance variation of the applications and make the most effective decision(s) (e.g., auto-scaling, live migration or both). Moreover, regression models exploit the correlation between VM and PM workloads to predict the VMs' power consumption for efficient allocation/re-allocation of the VMs. Consequently, on the basis of the predicted workload and energy consumption of each VM, the VMs' total cost caused by the most effective decision(s) can be estimated. To reach this goal, several steps are needed to detect the PMs' workload state (i.e., underloaded or overloaded), predict the PMs'/VMs' workload and power consumption, and thereby estimate the total cost of the scaled/migrated VMs. These steps are described in detail in the following subsections.

PMs Performance Detection
Step 1: The threshold percentages of the PMs' CPU utilization and RAM usage are determined (i.e., lower, upper and max_upper; 25%, 85% and 95%, respectively), and the workload of the PMs is monitored periodically. Algorithm 1 shows the detailed process for detecting underloaded and overloaded PMs. This algorithm combines two sub-algorithms: 1) live migration with VM re-allocation in order to switch the underloaded host to power saving mode and hence save energy-related costs; this mechanism also aims to minimize the overall cost of migration by re-allocating the VMs to the most energy-efficient host (if possible), as presented in Algorithm 2; and 2) an integration of auto-scaling, live migration and re-allocation in order to prevent the host from becoming overloaded; this mechanism helps to select the most cost-effective action(s) in order to minimize the VMs' total cost caused by scaling and migration decisions, as presented in Algorithm 3. The notations used in the algorithms and their definitions are summarized in Tab. 1.
Step 2: Underloaded PMs are detected through Algorithm 1, and appropriate actions, such as live migration and re-allocation, are then taken to save cost. If the workload of PMi ($\sum_{i=1}^{n} \text{VMs Workload}$) is less than or equal to the lower threshold (e.g., 25%), the ARIMA model is utilized to predict the VMs' workload for the next time interval (e.g., every 5 min) on the basis of historical workload patterns (see Step 4). This process can detect an underloaded PMi in advance, so that its VMs can be migrated and PMi can be switched to power saving mode. Afterward, if the predicted workload of the VMs for the next time interval is still less than or equal to the lower threshold, VM live migration and re-allocation are performed using Algorithm 2.
Algorithm 2 is also used to select a matching destination PMj to host the migrated VMs, and to check whether the cost incurred by VM live migration is less than the cost of switching the source PMi to power saving mode. To do so, the PMs are sorted in descending order of energy efficiency, so that the VMs are migrated to the most energy-efficient suitable host. In this regard, the estimate of the energy efficiency of the two hosts (i.e., source PMi and destination PMj) can be calculated as $PM_{Power}$. For example, if $PM_{Power}$ > 1, the destination host is more energy efficient than the source; if $PM_{Power}$ = 1, the destination host is similar to the source in terms of energy efficiency; and if $PM_{Power}$ < 1, the destination host is less energy efficient than the source.
The algorithm starts with the PMj that has the lowest idle power (the most energy-efficient host) and checks whether PMj has enough resources to meet the migration requirements, while simultaneously ensuring that the destination host PMj does not exceed the upper threshold after the migrated VMs are allocated. This algorithm ensures that: 1) the destination PMj is not overloaded after the VM migration process, and 2) the source PMi is switched to power saving mode once the migration takes place in order to save cost.
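The destination selection described above can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 2: the `Host` record, the `pick_destination` function and all numeric values are hypothetical, only vCPU capacity is checked, and the comparison between migration cost and power-saving cost is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

UPPER_THRESHOLD = 85.0  # upper utilization threshold (%) quoted in Step 1


@dataclass
class Host:
    name: str
    idle_power_w: float   # idle power draw in watts (hypothetical values below)
    utilization: float    # current CPU utilization in percent
    free_vcpus: int       # vCPUs still available on the host


def pick_destination(hosts: List[Host], needed_vcpus: int,
                     added_utilization: float) -> Optional[Host]:
    """Return the first suitable host in order of increasing idle power."""
    for host in sorted(hosts, key=lambda h: h.idle_power_w):
        has_capacity = host.free_vcpus >= needed_vcpus
        stays_below_upper = host.utilization + added_utilization <= UPPER_THRESHOLD
        if has_capacity and stays_below_upper:
            return host
    return None  # no host qualifies, so the VMs stay on the source


# Hypothetical candidates; idle power is used as the energy-efficiency proxy.
hosts = [
    Host("Host D", idle_power_w=90.0, utilization=10.0, free_vcpus=8),
    Host("Host B", idle_power_w=40.0, utilization=80.0, free_vcpus=4),
    Host("Host C", idle_power_w=60.0, utilization=30.0, free_vcpus=4),
]
chosen = pick_destination(hosts, needed_vcpus=2, added_utilization=10.0)
print(chosen.name if chosen else "no destination")
```

In this example the host with the lowest idle power is skipped because accepting the VMs would push it above the upper threshold, so the next most efficient host is selected.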
Step 3: Overloaded PMs are detected through Algorithm 1, and the candidate VMs that need to be scaled/migrated are then identified. If the workload of PMi is within the range [upper, max_upper], the ARIMA model is utilized to predict the workload of the VMs on PMi for the next time interval (e.g., every 5 min) on the basis of historical workload patterns (see Step 4). This process can detect an overloaded PMi in advance, so that preventive actions such as VM auto-scaling and live migration can be performed. Further, this mechanism helps to control the number of scaling and migration decisions in order to prevent unnecessary scaling/migration triggered by small workload peaks (false alarms). Afterward, if the predicted workload of the VMs is still within the range [upper, max_upper] for the next interval, the VM auto-scaling/live migration process is carried out using Algorithm 3.
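The two-stage check in Steps 2 and 3 (current workload first, then the predicted workload for the next interval) can be sketched as below. This is an illustrative reading rather than the paper's Algorithm 1: `classify`, `detect` and the `predict_next` callable are hypothetical names, `predict_next` stands in for the ARIMA forecast, and workloads above max_upper are left unhandled because the text does not describe that case.

```python
from typing import Callable, Sequence

LOWER, UPPER, MAX_UPPER = 25.0, 85.0, 95.0  # thresholds (%) quoted in Step 1


def classify(utilization: float) -> str:
    """Map a PM utilization percentage to a load state."""
    if utilization <= LOWER:
        return "underloaded"
    if UPPER <= utilization <= MAX_UPPER:
        return "overloaded"
    if utilization > MAX_UPPER:
        return "unspecified"  # not covered by the description above
    return "normal"


def detect(history: Sequence[float],
           predict_next: Callable[[Sequence[float]], float]) -> str:
    """Act only when the current and the predicted state agree."""
    current = classify(history[-1])
    if current not in ("underloaded", "overloaded"):
        return "no_action"
    predicted = classify(predict_next(history))
    if predicted != current:
        return "no_action"  # short-lived peak or dip, treated as a false alarm
    if current == "underloaded":
        return "migrate_and_reallocate"   # handled by Algorithm 2 in the text
    return "scale_or_migrate"             # handled by Algorithm 3 in the text


# A peak that the forecast expects to disappear triggers no action.
print(detect([60.0, 90.0], lambda h: 70.0))
```

Requiring agreement between the measured and the forecast state is what filters out the small workload peaks mentioned above.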
The proposed Algorithm 3 combines auto-scaling (vertical/horizontal scaling) with live migration to obtain the most appropriate cost-effective decision(s). The overloaded VMs are first scaled/migrated (e.g., resize VMs, migrate existing VMs and resize them, or initiate new VMs), and an appropriate destination PMj is then selected to host them. To achieve this, the following conditions are tested in order and the associated actions are performed: 1) if PMi (the source host) has enough resources to meet the scaling requirements, vertical scaling is performed on this PMi (note that vertical scaling is restricted by the capacity of PMi [17,35,36]); 2) if PMi does not have enough resources, the PMs are sorted in descending order of energy efficiency, as described in Step 2, and the migration and vertical scaling decision is performed, in which the overloaded VMs are first migrated to an appropriate host PMj and then vertically scaled. The algorithm also checks whether PMj has enough resources to meet the migration and scaling requirements, while ensuring that the destination host PMj does not exceed the upper threshold after the migrated VMs are allocated along with their scaling requirements (the additional resources); otherwise, 3) horizontal scaling is performed on PMj in a similar manner to the previous action, by placing the new VM on an appropriate destination. This algorithm ensures that: 1) the destination PMj is not overloaded after the VM scaling and migration processes, 2) the workload of the source PMi is significantly decreased once the scaling/migration process is performed, and 3) the VMs' total cost caused by scaling/migration decisions is minimized. For VM scaling, Algorithm 4 is utilized to select the appropriate VM size for cost-effective scaling on the basis of the closest size in the set of instance sizes defined by the cloud provider (e.g., small, medium and large VM).
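The order in which the three conditions are tested can be summarized in a short sketch. It is a simplification under stated assumptions, not the paper's Algorithm 3: `choose_action` is a hypothetical helper, capacity is reduced to a single vCPU count, and `destination_free_vcpus` is assumed to be the capacity of the best candidate PMj that can be used without crossing the upper threshold (or `None` if no candidate exists).

```python
from typing import Optional


def choose_action(source_free_vcpus: int, vm_vcpus: int, extra_vcpus: int,
                  destination_free_vcpus: Optional[int]) -> str:
    """Return the first applicable action, tested in the order given in the text."""
    # 1) Resize in place when the source PMi can absorb the extra resources.
    if source_free_vcpus >= extra_vcpus:
        return "vertical_scaling_on_source"
    # 2) Otherwise migrate the VM and resize it on a destination PMj that can
    #    hold the VM together with its additional resources.
    if (destination_free_vcpus is not None
            and destination_free_vcpus >= vm_vcpus + extra_vcpus):
        return "migrate_then_vertical_scaling"
    # 3) Otherwise fall back to creating a new VM (horizontal scaling).
    return "horizontal_scaling"


print(choose_action(source_free_vcpus=1, vm_vcpus=2, extra_vcpus=2,
                    destination_free_vcpus=8))
```

Testing vertical scaling on the source first avoids migration overhead whenever the source host still has spare capacity.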
However, over-provisioning may occur with this mechanism (e.g., when there is a difference between the requested scaling resources and the cloud providers' predefined instance sizes). In that case, the extra resources are wasted: capacity is created but remains unused, and customers may pay more without any benefit [17,37], which is not the goal of VM auto-scaling. Moreover, wasted resources may increase energy costs because of their under-utilization and decrease the revenue of Cloud providers, as the number of resource requests that can be accepted may decrease. Motivated by these considerations, a new self-configuration algorithm is proposed for resizing/creating VMs on the basis of the exact size of the requested resources; it aims to allocate adequate resources to VMs and avoid over-provisioning. In this way, this algorithm (i.e., Algorithm 4) can maximize the usage of cloud providers' resources, and thereby their profits, while customers pay only for the resources they actually use.
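The over-provisioning effect can be made concrete with a small comparison between picking the closest predefined instance size and allocating exactly what was requested. The instance catalogue and function names below are hypothetical and only vCPUs are considered; the sketch illustrates the idea behind self-configuration rather than reproducing Algorithm 4.

```python
from typing import Tuple

# Hypothetical provider catalogue: instance name -> vCPUs.
INSTANCE_SIZES = {"small": 1, "medium": 2, "large": 4}


def closest_instance(requested_vcpus: int) -> Tuple[str, int]:
    """Smallest predefined instance that covers the request."""
    for name, vcpus in sorted(INSTANCE_SIZES.items(), key=lambda kv: kv[1]):
        if vcpus >= requested_vcpus:
            return name, vcpus
    raise ValueError("request exceeds the largest predefined instance")


def over_provisioned_vcpus(requested_vcpus: int) -> int:
    """vCPUs allocated but not requested when a predefined size is used."""
    _, vcpus = closest_instance(requested_vcpus)
    return vcpus - requested_vcpus


def self_configured_vcpus(requested_vcpus: int) -> int:
    """Self-configuration allocates exactly the requested amount."""
    return requested_vcpus


# A 3-vCPU request maps to a 4-vCPU instance, leaving 1 vCPU unused.
print(closest_instance(3), over_provisioned_vcpus(3), self_configured_vcpus(3))
```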

VMs Workload Prediction
Step 4: In this step, the VMs' workload for the next time interval is predicted using the ARIMA model. The workload is expressed as the number of requested VMs and their capacity (i.e., vCPUs, memory, disk and network) for executing the application, and is calculated based on historical workload patterns retrieved from a knowledge database. Cloud applications exhibit different workload patterns depending on customers' usage behavior, and these consume different amounts of power depending on the utilized resources. As stated in [38], there are several workload patterns, such as static, periodic, continuously changing, unpredicted and once-in-a-lifetime. The static workload pattern can be easily predicted, but the other patterns pose challenges for workload prediction. For example, they may reflect temporary fluctuations of the workload (continuously changing and once-in-a-lifetime) or may be difficult to predict in advance (the unpredicted pattern). These patterns do not necessarily occur in data centers on a daily basis [32]. Therefore, achieving high prediction accuracy requires workload patterns that recur approximately in the time series history [32]. Periodic workloads are thus more appropriate for allowing Cloud services to quickly scale capacity up or down to meet demand and to dynamically control the cost of the infrastructure. For this reason, a simulated periodic workload pattern is used as the historical data in this work. Owing to its sophistication and accuracy, the ARIMA model is widely used in different fields, such as economics and finance; further details can be found in [39]. After the VMs' workload has been predicted using the ARIMA model on the basis of historical data, the workload of the PMs and the power consumption of the PMs/VMs are predicted using regression models.

PMs Workload Prediction
Step 5: The PM workload prediction is expressed as PM CPU utilization, which can be calculated based on the relationship between the number of vCPUs and the PM CPU utilization for the heterogeneous PMs, as shown in Figs. 2-4.
In this step, a linear regression model is utilized to predict the PM's CPU utilization on the basis of the used ratio of the vCPUs requested by the VMs, taking the current workload of the PM into account because the PM may already be running other VMs [40,41]. Eq. (1) is used for this purpose.
where $PMx_{PredUtil}$ indicates the predicted CPU utilization of the PM; α and β denote the slope and the intercept of the CPU utilization regression; $VMy_{ReqvCPUs}$ and $VMy_{PredUtil}$ denote the number of requested vCPUs and the predicted utilization of each VM; and $PMx_{CurrUtil}$ and $PMx_{IdleUtil}$ denote the current and idle utilization of the PM, respectively.
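To show how the symbols defined above could fit together, the sketch below assumes one plausible linear form: the slope is applied to the utilization-weighted number of requested vCPUs, the intercept is added, and the PM's current load above idle is carried over. This form is an assumption made for illustration and may differ from Eq. (1); the function name and all numbers are hypothetical.

```python
from typing import Sequence, Tuple


def predict_pm_utilization(alpha: float, beta: float,
                           vms: Sequence[Tuple[int, float]],
                           pm_curr_util: float, pm_idle_util: float) -> float:
    """Predicted PM CPU utilization (%) under the assumed linear form.

    vms holds (requested vCPUs, predicted utilization in %) for each VM.
    """
    # Utilization-weighted vCPU demand of the VMs to be placed (assumed form).
    weighted_vcpus = sum(req * util / 100.0 for req, util in vms)
    # Regression term plus the load the PM already carries above idle.
    return alpha * weighted_vcpus + beta + (pm_curr_util - pm_idle_util)


# Hypothetical coefficients and workloads.
pred = predict_pm_utilization(alpha=10.0, beta=5.0,
                              vms=[(2, 50.0), (1, 100.0)],
                              pm_curr_util=20.0, pm_idle_util=5.0)
print(pred)
```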

PMs Energy Consumption Prediction
Step 6: After the workload of the PMs has been predicted in the previous step, the power consumption of the PMs is predicted in this step, based on the relationship between the predicted workload (CPU utilization) and the power consumption of the PM. Therefore, each PM under consideration should be characterized using regression models with respect to the correlation between its power consumption and its CPU utilization, as shown in Fig. 5. Consequently, the predicted power consumption of the PM, $PMx_{PredPwr}$, measured in watts, can be expressed as a linear function of the predicted CPU utilization of the PM, as shown in Fig. 5 and in Eq. (2), where α and β denote the slope and intercept obtained from the regression, and $PMx_{PredUtil}$ denotes the predicted CPU utilization of the PM.
However, not all PMs necessarily follow this linear relation, owing to the natural heterogeneity of PMs' resources, as shown for example in Figs. 6 and 7. In such cases, the correlation between power consumption and the targeted PM's CPU utilization can be characterized using other regression models, such as a polynomial model, as presented in Eq. (3).
where α, γ and ϕ represent slopes, β denotes the intercept and $PMx_{PredUtil}$ denotes the predicted CPU utilization of the PM.
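A minimal sketch of both model shapes is given below. The linear model follows the description of Eq. (2) directly; for Eq. (3) a third-order polynomial is assumed because three slopes (α, γ, ϕ) and one intercept (β) are named, but the exact form is not confirmed by the text. The `fit_line` helper is an ordinary least-squares fit added for illustration, and all measurements are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_power(util: float, alpha: float, beta: float) -> float:
    """Linear power model: watts as a function of CPU utilization (%)."""
    return alpha * util + beta


def cubic_power(util: float, alpha: float, gamma: float,
                phi: float, beta: float) -> float:
    """Assumed third-order polynomial power model."""
    return alpha * util + gamma * util ** 2 + phi * util ** 3 + beta


# Hypothetical (utilization %, watts) measurements for one host.
alpha, beta = fit_line([0.0, 50.0, 100.0], [60.0, 85.0, 110.0])
print(alpha, beta, linear_power(70.0, alpha, beta))
```

Which of the two shapes applies would be decided per host by inspecting the measured utilization/power curve, as the text notes for heterogeneous PMs.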

VMs Energy Consumption Prediction
Step 7: The goal of this step is to attribute the PM's predicted power consumption to the newly requested VM, $VMx_{PredPwr}$, as well as to the VMs already running on the physical host, which can be computed using Eq. (4).
where $VMx_{PredPwr}$ denotes the predicted power consumption of one VM in watts. $VMx_{ReqvCPUs}$ and $VMx_{PredUtil}$ denote the number of requested vCPUs and the predicted CPU utilization of the VM, respectively. $\sum_{y=1}^{VMcount} VMy_{ReqvCPUs}$ denotes the total number of vCPUs of the VMs on the same PM. $PMx_{IdlePwr}$ and $PMx_{PredPwr}$ denote the idle and predicted power consumption of a single PM.
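One way to attribute PM power to VMs with the symbols above is sketched here. It assumes that idle power is shared in proportion to requested vCPUs and that the remaining (dynamic) power is shared in proportion to utilization-weighted vCPUs. This split is an assumption for illustration and is not claimed to be the exact form of Eq. (4); the function name and values are hypothetical.

```python
from typing import Sequence, Tuple


def vm_predicted_power(vms: Sequence[Tuple[int, float]], x: int,
                       pm_idle_pwr: float, pm_pred_pwr: float) -> float:
    """Predicted power (W) attributed to VM number x on one PM.

    vms holds (requested vCPUs, predicted CPU utilization in %) per VM.
    """
    req, util = vms[x]
    total_vcpus = sum(v for v, _ in vms)
    weighted_total = sum(v * u for v, u in vms)
    # Idle power is divided by each VM's share of the requested vCPUs.
    idle_share = pm_idle_pwr * req / total_vcpus
    # Dynamic power is divided by utilization-weighted vCPU share.
    dynamic = pm_pred_pwr - pm_idle_pwr
    dynamic_share = dynamic * (req * util) / weighted_total if weighted_total else 0.0
    return idle_share + dynamic_share


vms = [(2, 50.0), (2, 100.0)]  # hypothetical VMs on the same PM
shares = [vm_predicted_power(vms, i, pm_idle_pwr=60.0, pm_pred_pwr=120.0)
          for i in range(len(vms))]
print(shares, sum(shares))
```

A useful property of this kind of split is that the per-VM values add up to the PM's predicted power, so no consumption is left unattributed.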
Note that energy providers usually charge per kilowatt-hour (kWh). Therefore, the power consumption needs to be converted to energy using Eq. (5).
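The conversion itself is standard: energy in kWh is power in watts divided by 1000, multiplied by the duration in hours. A small sketch, with hypothetical function names and the tariff passed in as a parameter, is:

```python
def watts_to_kwh(power_w: float, duration_s: float) -> float:
    """Energy in kWh for a constant power draw over a duration in seconds."""
    return (power_w / 1000.0) * (duration_s / 3600.0)


def energy_cost(power_w: float, duration_s: float, price_per_kwh: float) -> float:
    """Cost of the consumed energy at a given tariff per kWh."""
    return watts_to_kwh(power_w, duration_s) * price_per_kwh


# 100 W sustained for one hour corresponds to 0.1 kWh.
print(watts_to_kwh(100.0, 3600.0))
```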

VMs Total Cost Estimation
Step 8: Finally, using the predicted resource usage and energy consumption of VMx obtained in Steps 4 and 7, the total cost of VMx is estimated in this step.
In the case of VM migration, the total time $Time_s$ needed to migrate VMx can be expressed as follows: where $T_{Mig}$ denotes the total migration time of VMx in seconds. $T_{Mig\_Start}$ and $T_{Mig\_End}$ denote the start and end times of the migration process. $T_{Run\_Sou}$ denotes the sum of the VMx running time on PMi before the migration process starts and the migration time $T_{Mig}$ itself, and $T_{Run\_Sou\_Bef\_Mig}$ denotes the VMx running time before migration. $T_{Run\_Des}$ denotes the VMx running time on PMj during and after the migration process, and $T_{Run\_Des\_Aft\_Mig}$ denotes the VMx running time after migration.
In the case of VM auto-scaling, the total time $Time_s$ needed to scale VMx can be expressed as follows: where $T_{Scaling\_VMx}$ denotes the VMx scaling time in seconds. $T_{Start\_Scaling}$ and $T_{End\_Scaling}$ denote the start and end times of scaling. $T_{Existing\_VMx}$ denotes the VMx running time before the scaling process starts. $T_{Start\_Run}$ and $T_{End\_Run}$ denote the start and end times of the operating task.
For estimating the total cost of VMx based on the suitable action(s), Eq. (11) is proposed.
where $VMx_{Est\_Cost\_PMi}$ denotes the total estimated cost of VMx before and during the action on the source PMi. $VMx_{ReqvCPUs\_PMi}$ denotes the number of vCPUs requested by each VM and $VMx_{Pred\_U\_PMi}$ denotes the predicted CPU utilization of each VM, multiplied by the cost of the requested vCPUs for a period of time (the time depends on the performed action). $VMx_{Pred\_RAM\_U\_PMi}$ denotes the predicted memory usage, multiplied by the cost of that resource for a period of time. The disk and network resources are treated similarly. $VMx_{Pred\_Energy\_PMi}$ denotes the predicted energy consumption of VMx, multiplied by the energy cost set by the energy provider. Similarly, Eq. (11) can be used to estimate the VMx cost during and after the action at the destination PMj, with the resources of PMi (e.g., CPU, RAM, disk, network and energy) substituted by those of PMj. Moreover, an additional license fee α (i.e., a constant £0.1/h) is added for the new VM when a horizontal scaling decision is made. Thus, Eq. (12) can be used to obtain the total estimated cost of VMx, $VMx_{Total\_Est\_Cost}$, before and after the action is made.
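The cost terms described above can be combined as in the following sketch. It assumes a simple additive form (each resource's predicted usage times its hourly price times the running time, plus energy times the tariff, plus an optional license fee), which may differ in detail from Eqs. (11) and (12). The prices are the ones quoted in the experimental setup of this paper; the function and parameter names, and the example usage figures, are hypothetical.

```python
# Prices quoted in the experimental setup (GBP).
PRICE_VCPU_H = 0.008      # per vCPU per hour
PRICE_RAM_GB_H = 0.016    # per GB of memory per hour
PRICE_DISK_GB_H = 0.0001  # per GB of storage per hour
PRICE_NET_GB_H = 0.0001   # per GB of network per hour
PRICE_KWH = 0.14          # per kWh of energy
LICENSE_FEE_H = 0.1       # per hour, new VM created by horizontal scaling


def vm_cost_on_host(req_vcpus: int, pred_cpu_util: float, pred_ram_gb: float,
                    pred_disk_gb: float, pred_net_gb: float,
                    pred_energy_kwh: float, hours: float,
                    new_vm: bool = False) -> float:
    """Estimated cost of one VM on one host for a period (assumed additive form)."""
    hourly = (req_vcpus * pred_cpu_util / 100.0 * PRICE_VCPU_H
              + pred_ram_gb * PRICE_RAM_GB_H
              + pred_disk_gb * PRICE_DISK_GB_H
              + pred_net_gb * PRICE_NET_GB_H)
    cost = hourly * hours + pred_energy_kwh * PRICE_KWH
    if new_vm:
        cost += LICENSE_FEE_H * hours  # license fee for a horizontally scaled VM
    return cost


# Total estimate: cost on the source host plus cost on the destination host.
source = vm_cost_on_host(2, 50.0, 2.0, 10.0, 1.0, pred_energy_kwh=0.5, hours=2.0)
destination = vm_cost_on_host(2, 50.0, 2.0, 10.0, 1.0, pred_energy_kwh=0.4, hours=1.0)
total = source + destination
print(round(source, 4), round(destination, 4), round(total, 4))
```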

Experimental Setup
This section presents the experimental details used to critically evaluate the performance of the proposed approach on a real Cloud testbed. First, the cloud testbed and monitoring infrastructure are introduced. Afterward, the experimental design is presented.

Cloud Testbed and Monitoring Infrastructure
The experiments are conducted on a Cloud testbed composed of a cluster of commodity Dell servers, each equipped with 16 GB of RAM and 500 GB of SATA HDD storage and running Linux CentOS version 6.6. Four different PMs are considered on the cloud testbed: three of them (Hosts A, C and D) are equipped with a four-core Intel Xeon X3430 CPU, and the last one (Host B) is equipped with an eight-core Intel Xeon E3-1230 V2 CPU. The testbed also offers a Network File System (NFS) share, which runs on the cluster head node and provides 2 TB of storage for VM images. Moreover, the testbed is supported by a Virtual Machine Manager (VMM) with OpenNebula [42].
In addition, the Cloud testbed monitors and measures resource usage and energy. At the physical host level, each PM is equipped with a WattsUp meter [44] that measures power consumption every second; the values are then pushed to Zabbix (an infrastructure monitoring tool) [45]. For each of the running PMs and VMs, resource usage (e.g., CPU, memory, network and disk) is also monitored by Zabbix. The power consumption of the PMs and the resource usage of the VMs are transferred together to the proposed approach, which can then measure the energy consumption as well as the VMs' total cost.
Finally, with regard to the VMs considered in this paper, Rackspace [46] is used as the reference for VM configurations, as it provides an extensive range of VM types that gives customers plenty of flexibility to satisfy their needs. In this paper, three types of VMs (i.e., small, medium and large) with different capacities are utilized. Furthermore, the cost of virtual resources is determined with reference to ElasticHosts [47] and VMware [48], which describe a detailed service cost breakdown as follows: 1 vCPU = £0.008/h, 1 GB memory = £0.016/h, 1 GB storage = £0.0001/h and 1 GB network = £0.0001/h; the energy cost is £0.14/kWh [49].
Note that customers are charged based on the resources they utilize in British Pound Sterling (GBP/£).

Design of Experiments
In this subsection, an experimental design is presented to demonstrate the capability of the proposed approach to detect and predict underloaded/overloaded PMs in order to handle service performance variation in a cost-effective manner. The approach also aims to predict the workload and power consumption, and to estimate the total cost associated with migrated and scaled VMs when heterogeneous PMs are used. Additionally, the approach focuses on the total cost saving that can be achieved if VMs are migrated/scaled to/on different hosts with different energy characteristics. In the experimental design, a real workload pattern for Cloud applications is represented by generating historical data, in which all resources (i.e., CPU, memory, network and disk) are stressed to full utilization on different VM types using the Stress-ng tool [50]. Then, guided by the intuition in [51], the time interval of the generated workload for each VM type is divided into four slots (30 min each): three of them are used as the historical data set for prediction, whereas the last one is used as the testing data set for evaluating the predicted results. The proposed approach is applied as follows. Firstly, underloaded and overloaded hosts are detected in order to handle service performance variation. Then, the auto.arima function from the R package [52] is used to predict the workload of the VMs; it automatically selects the best-fit ARIMA model on the basis of the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) value. Afterwards, the process passes through the framework cycle [34] and takes into account the relationship between the physical and virtual resources for predicting the VMs' power consumption when several PMs are used.
Finally, the most cost-effective decision(s) is performed, and the total cost is estimated for both migrated and scaled VMs on the basis of their predicted power consumption and workload.
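The slot-based split of the generated workload can be illustrated as follows. The helper name and the one-sample-per-minute series are hypothetical; the sketch only shows how four equal slots yield three slots of historical data and one slot of testing data.

```python
from typing import List, Sequence, Tuple


def split_slots(samples: Sequence[float], n_slots: int = 4,
                n_train_slots: int = 3) -> Tuple[List[float], List[float]]:
    """Split a time series into equal slots; earliest slots train, the rest test."""
    slot_len = len(samples) // n_slots
    slots = [list(samples[i * slot_len:(i + 1) * slot_len]) for i in range(n_slots)]
    train = [x for slot in slots[:n_train_slots] for x in slot]
    test = [x for slot in slots[n_train_slots:] for x in slot]
    return train, test


# Hypothetical series: one sample per minute over 4 x 30 min.
samples = [float(i) for i in range(120)]
train, test = split_slots(samples)
print(len(train), len(test))
```

Keeping the split chronological, rather than random, preserves the periodic structure that the forecasting model relies on.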

Evaluation and Results Discussion
In this section, an extensive quantitative evaluation of the proposed approach for performance and energy-based cost prediction is presented, in which the VMs' total cost during service operation is estimated with migration and scaling decisions taken into account. First, the predicted workload results are introduced for three types of VMs (i.e., small, medium and large) operated on several PMs on the basis of a historical periodic workload pattern. Afterward, the results of VM power consumption prediction with the live migration and auto-scaling processes are discussed. Finally, the estimated cost of the VMs is presented.

VMs Workload Prediction
Regarding Algorithm 1, the predictive model is employed to reduce the number of VM migration/scaling decisions when PMi is underloaded/overloaded with respect to the predefined thresholds (i.e., lower, upper and max_upper), thereby preventing unnecessary migration and scaling triggered by small workload peaks (false alarms). With regard to Algorithm 2, VMs are migrated and re-allocated to an appropriate destination PMj (i.e., one that has sufficient resources and is energy efficient) when PMi is underloaded, that is, when the predicted workload is less than or equal to the predefined lower threshold; PMi is then switched to power saving mode and cost is saved. Furthermore, according to Algorithm 3, the most cost-effective scaling/migration decision(s) (e.g., resize VMs, migrate existing VMs and resize them, or initiate new VMs) is performed and VMs are allocated/re-allocated to the selected destination PMj (i.e., one that has sufficient resources and is energy efficient) when PMi is overloaded, that is, when the predicted workload is within the range [upper, max_upper]. The predicted VM workload in relation to the actual workload is illustrated in Figs. 8-10, which include CPU, RAM, disk and network usage. The figures show that the predicted and actual VM workloads for CPU, RAM and network approximately match where the utilization peaks are periodic. This is because the ARIMA model is capable of capturing the historical seasonal trend and therefore gives an accurate prediction. In contrast, the prediction accuracy for the disk workload is lower than for CPU, network and RAM. This can be traced to the high variation in the disk workload pattern of the generated periodic historical data, which does not closely match from one interval to the next.
Besides the mean values of VMs predicted workload, the results also illustrated the confidence intervals of each VM predicted workload with low and high percentage (i.e., 80% and 95%) based on the ARIMA model.
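The threshold logic described above can be illustrated with a minimal sketch. It is not the authors' Algorithm 1; the function name `classify_pm`, the numeric threshold values and the extra `above_max_upper` state are assumptions introduced here for illustration, since this section does not state them.

```python
def classify_pm(predicted_util, lower, upper, max_upper):
    """Classify a PM from its predicted utilization (all values in percent)."""
    if not (0 <= lower < upper <= max_upper):
        raise ValueError("thresholds must satisfy 0 <= lower < upper <= max_upper")
    if predicted_util <= lower:
        # Candidate for VM migration followed by power saving mode.
        return "underloaded"
    if upper <= predicted_util <= max_upper:
        # Candidate for a scaling/migration decision.
        return "overloaded"
    if predicted_util > max_upper:
        # Assumption: this case is not described in the section.
        return "above_max_upper"
    return "normal"


# Hypothetical thresholds; the section gives no numeric values.
LOWER, UPPER, MAX_UPPER = 20.0, 80.0, 95.0

states = [classify_pm(u, LOWER, UPPER, MAX_UPPER) for u in (10.0, 50.0, 85.0, 99.0)]
print(states)
```

Because the classification uses the predicted workload rather than an instantaneous measurement, a short peak that the forecast does not sustain would not trigger a decision, which is the false-alarm behaviour the text refers to.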

The prediction accuracy is evaluated using several metrics, namely Mean Error (ME), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percent Error (MAPE), and Mean Percentage Error (MPE).
ME measures the average prediction error, whereas RMSE is the square root of the mean of the squared errors. MAE is the average absolute difference between the predicted and actual values. MPE is the average of the percentage errors between the predicted and actual values, while MAPE is the average absolute difference between the predicted and actual values, expressed as a percentage of the actual value [53]. Values of these metrics that are low or close to zero indicate that the prediction method has achieved high accuracy.
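As a minimal sketch, the five metrics can be computed as follows. The sign convention (error = actual minus predicted) and the sample values are assumptions made here; the section does not state which convention was used.

```python
import math


def accuracy_metrics(actual, predicted):
    """Return ME, MAE, RMSE, MPE and MAPE for two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed convention: error = actual - predicted.
    errors = [a - p for a, p in zip(actual, predicted)]
    # Percentage errors are relative to the actual value (must be non-zero).
    pct_errors = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MPE": sum(pct_errors) / n,
        "MAPE": sum(abs(p) for p in pct_errors) / n,
    }


# Hypothetical actual and predicted values, for illustration only.
metrics = accuracy_metrics([100.0, 200.0, 400.0], [90.0, 210.0, 400.0])
print(metrics)
```

Note that ME and MPE can be close to zero even when individual errors are large, because positive and negative errors cancel; MAE, RMSE and MAPE do not have this property.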

VMs Power Consumption Prediction
Besides the VM workload, the proposed approach predicts the power consumption for different numbers of VMs running on the source PMi and the destination PMj, for both live migration and auto-scaling, as described next.

VMs Live Migration Power Consumption Prediction
The predicted VM power consumption in relation to the actual values is shown in Figs. 11-13, in which different VMs are running on the source PMi (Host A) and the destination PMj. Note that, depending on the migration decision, the destination PMj can be the most energy efficient host (Host B), a host with a configuration similar to the source (Host C), or a host that is less energy efficient than the source PMi (Host D). According to Algorithm 2, the migration is performed for the underloaded PMi if the selected destination PMj has enough resources and its upper threshold is not exceeded once the VM migration takes place. It is also worth mentioning that a change in the predicted CPU utilization of the PM influences the predicted power consumption of all VMs. Meanwhile, Tab. 3 presents the metrics used to verify the accuracy of the power consumption prediction for three types of VMs (i.e., small, medium and large) based on a periodic workload pattern.

VMs Auto-scaling Power Consumption Prediction
Figs. 14-22 show the predicted results in relation to the actual power consumption of VMs operating on different hosts under different techniques (vertical scaling, combined migration and vertical scaling, and horizontal scaling). According to Algorithm 3, vertical scaling is performed for the overloaded VMs on the same host if the host has enough resources to meet the scaling requirements (vertical scaling is limited by the capacity of PMi). Otherwise, either combined migration and vertical scaling or horizontal scaling is performed for the overloaded VMs on a number of hosts, e.g., Host B, which is the most energy efficient, Host C, which has a configuration similar to the source host (Host A), and Host D, which is less energy efficient. Note that vertical scaling was not performed for all types of VMs. The large VM has four CPU cores, and the PM hosting it (e.g., Host A, Host C or Host D) has the same number of CPU cores. Thus, vertical scaling on the same host is not available, and only horizontal scaling can be performed for this type of VM on these hosts. On the other hand, vertical scaling, or combined migration and vertical scaling, can be performed for the large VM only on Host B, since it has eight CPU cores.
Meanwhile, Tabs. 4-6 present the metrics used to verify the accuracy of the power consumption prediction for three types of VMs (i.e., small, medium and large).
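The core-count constraint on vertical scaling described above can be sketched as a simple feasibility check. The host core counts follow the text (four cores on Host A, Host C and Host D, eight on Host B); the function `can_scale_vertically` and its parameters `extra_cores` and `other_vm_cores` are hypothetical and only illustrate the idea.

```python
# Core counts as stated in the text for the four hosts.
HOST_CORES = {"Host A": 4, "Host B": 8, "Host C": 4, "Host D": 4}


def can_scale_vertically(host, vm_cores, extra_cores=1, other_vm_cores=0):
    """True if the host can give the VM `extra_cores` more CPU cores."""
    required = vm_cores + extra_cores + other_vm_cores
    return required <= HOST_CORES[host]


LARGE_VM_CORES = 4  # the large VM has four CPU cores, per the text

# Hosts on which the large VM could be scaled up by one core.
feasible_hosts = [h for h in HOST_CORES if can_scale_vertically(h, LARGE_VM_CORES)]
print(feasible_hosts)
```

Under these assumptions, only Host B remains feasible for the large VM, which is consistent with the observation that vertical scaling of the large VM is possible only after migrating it to Host B.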

VMs Total Cost Estimation
The proposed approach is also able to estimate the total cost of VMs under live migration and auto-scaling when they are running on different PMs. Fig. 23 shows the estimated total cost of the VMs before live migration on Host A and after live migration to Host B, Host C and Host D, along with their migration cost. According to Algs. 1 and 2, the migration is performed for the underloaded PMi only if the cost incurred by live migration of the VMs to the selected destination PMj is less than the cost that can be saved by switching the source PMi to power saving mode. In this case, only a small VM can meet these conditions and be migrated to the selected destination PMj, provided that it is the only VM running on the source PMi. In addition, Fig. 24 presents the estimated cost saving that can be achieved for the small VM when it is migrated to different hosts. This cost saving is gained by switching the source (Host A) to power saving mode. For example, when the VM is migrated to the most energy efficient host (Host B), it can achieve approximately 100% cost saving, where the saving is the cost saved by switching PMi to power saving mode minus the cost incurred by the live migration decision. With a host configured similarly to the source (Host C), it can achieve around 76%, and with the less energy efficient host (Host D), about 27%. Hosts' energy efficiency therefore plays a major role in reducing overall energy consumption, and selecting the appropriate host to which the VMs are migrated has a great effect on the overall cost saving (e.g., migrating the VMs to the most energy efficient PM (Host B) is more cost-efficient than migrating them to a less energy efficient PM (Host D)).
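The migration condition and the definition of cost saving given above can be illustrated with a short sketch. The function name `migration_decision` and all monetary values are hypothetical; the section does not report the underlying cost figures or how the percentages in Fig. 24 are normalized.

```python
def migration_decision(power_saving_cost, migration_cost):
    """Decide whether to migrate VMs off an underloaded PM.

    power_saving_cost: cost saved by switching the source PM to power saving mode
    migration_cost:    cost incurred by live-migrating the VMs to the destination
    """
    net_saving = power_saving_cost - migration_cost
    return {
        "migrate": migration_cost < power_saving_cost,
        "net_saving": net_saving,
    }


# Hypothetical values in an arbitrary currency unit.
cheap_move = migration_decision(power_saving_cost=0.50, migration_cost=0.10)
costly_move = migration_decision(power_saving_cost=0.50, migration_cost=0.60)
print(cheap_move, costly_move)
```

In the first case the migration pays off and the net saving is positive; in the second the migration would cost more than switching the source PM to power saving mode saves, so the VMs stay where they are.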

VMs Auto-Scaling/Migration Cost Estimation
Figs. 25-27 show the estimated total cost for the three VM types operating on multiple PMs under different scaling/migration strategies. According to Algorithms 3 and 4, the VMs are scaled to the right size of the requested resources using the self-configuration technique. Thus, vertical scaling is performed on the same host if the host has enough resources to meet the scaling requirements. Otherwise, either combined migration and vertical scaling or horizontal scaling is performed on a number of hosts, e.g., Host B, which is the most energy efficient, Host C, which has a configuration similar to the source host (Host A), and Host D, which is less energy efficient. As mentioned earlier, vertical scaling was not performed in place for the large VM, since it has the same number of CPU cores as the PM hosting it (e.g., Host A, Host C or Host D), which means that the host is fully utilized by the VM. However, vertical scaling can be performed for the large VM on Host B, which has eight cores, as shown in Figs. 25 and 26. The choice of scaling strategy has a great effect on the cost of the scaled VMs (e.g., vertical scaling can be more cost-efficient than the proposed combined migration and vertical scaling or horizontal scaling when the VMs are scaled on a host with a similar configuration), as shown in Figs. 25-27. This is because vertical scaling incurs no additional cost for migration (as in combined migration and vertical scaling) or for software licenses for new VMs [36] (as in horizontal scaling). However, vertical scaling is limited by the host's capacity [15,36]. Therefore, the proposed combined migration and vertical scaling mechanism helps in selecting the most cost-effective scaling strategy, rather than only choosing between scaling up and scaling out.
As shown in Figs. 26 and 27, the proposed combined migration and vertical scaling mechanism outperforms horizontal scaling. This is because the additional cost of software licenses for new VMs under horizontal scaling is larger than the cost of live migration under combined migration and vertical scaling. Furthermore, selecting hosts according to their energy efficiency has a great effect on the total cost of the scaled VMs (e.g., using the most energy efficient PM for horizontal scaling is more cost-efficient than using a less energy efficient PM). Although the workload utilization varies highly, the accuracy metrics show that the VM workload and power consumption are accurately predicted, and that the cost of live migration and auto-scaling is accurately estimated. Thus, the proposed approach can enhance cloud providers' cost decisions by selecting the most suitable cost-efficient migration and scaling techniques.
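The comparison of strategies discussed above can be sketched as a selection of the cheapest feasible option. This is an illustration under stated assumptions, not the authors' Algorithm 3 or 4: the `Option` class, the `cheapest` function, the cost decomposition into operating, migration and license cost, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    strategy: str
    host: str
    operating_cost: float        # assumed: resource usage plus energy cost on the host
    migration_cost: float = 0.0  # only for strategies that migrate VMs
    license_cost: float = 0.0    # only for strategies that start new VMs
    feasible: bool = True        # e.g. False if the host lacks capacity

    @property
    def total_cost(self):
        return self.operating_cost + self.migration_cost + self.license_cost


def cheapest(options):
    """Return the feasible option with the lowest total cost."""
    candidates = [o for o in options if o.feasible]
    if not candidates:
        raise ValueError("no feasible scaling option")
    return min(candidates, key=lambda o: o.total_cost)


# Hypothetical costs for a large VM whose source host has no spare cores.
options = [
    Option("vertical", "Host A", operating_cost=1.0, feasible=False),
    Option("migrate+vertical", "Host B", operating_cost=1.0, migration_cost=0.2),
    Option("horizontal", "Host B", operating_cost=1.0, license_cost=0.5),
]
best = cheapest(options)
print(best.strategy, best.host, best.total_cost)
```

With these illustrative values the combined migration and vertical scaling option wins because the assumed license cost exceeds the assumed migration cost, mirroring the argument made for Figs. 26 and 27.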

Conclusion and Future Work
This paper has proposed and verified a new hybrid approach for performance and energy-based cost prediction. The approach supports dynamic decision-making on auto-scaling and live migration costs, while remaining aware of the effects on energy consumption and application performance during service operation. It integrates auto-scaling with live migration in order to estimate the total cost of heterogeneous VMs by considering their resource usage and power consumption, while maintaining the expected level of application performance. The overall results show that the proposed hybrid approach can detect underloaded and overloaded hosts and take the most cost-effective decision(s) to handle the service performance variation. It can also accurately predict the workload and power consumption, as well as estimate the total cost for both migrated and scaled VMs operating on different PMs, on the basis of historical workload patterns.
In ongoing and future work, this approach will be extended to consider the impact of hardware accelerators, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), on energy consumption and service performance. This extension would be useful when modelling and identifying the energy consumption and total cost of Cloud services. Furthermore, it is hard and costly to conduct large-scale experiments in a real Cloud environment (e.g., a Cloud testbed), especially with limited resources. Therefore, simulation can be considered to further study scalability-related issues. Finally, additional Cloud application workload patterns, such as unpredictable and continuously changing patterns, as well as different prediction algorithms, such as Deep Neural Networks (DNNs), can be considered to broaden the scope of the workload prediction, power consumption prediction and VM cost estimation.