Allocation and Migration of Virtual Machines Using Machine Learning

Abstract: Cloud computing promises a new era of service delivery built on virtualization technology. Virtualization means creating the virtual infrastructure, devices, servers and computing resources needed to deploy an application smoothly. This widely practiced technology involves selecting an efficient Virtual Machine (VM) to complete a task by transferring applications from Physical Machines (PMs) to VMs or from one VM to another. The whole process is challenging not only in terms of computation but also in terms of energy and memory. This paper presents an energy-aware VM allocation and migration approach to meet the challenges faced by the growing number of cloud data centres. A Machine Learning (ML) based Artificial Bee Colony (ABC) algorithm is used to rank the VMs with respect to load while treating energy efficiency as a crucial parameter. The most efficient VMs are then selected and, depending on the dynamics of load and energy, applications are migrated from one VM to another. The simulation analysis is performed in Matlab and shows that the proposed approach achieves a greater reduction in energy consumption than existing studies.


Introduction
Cloud computing offers tremendous opportunities to the IT sector for outsourcing computing resources and services. It is one of the most renowned services in the technological world, in which Service Providers (SPs) and Service Users (SUs) are the key elements. It grants vast computing power to support numerous web-based services while providing large storage space. NIST (National Institute of Standards and Technology) has defined cloud computing as a model for enabling convenient network access to a shared pool of computing resources that can be rapidly provisioned with minimal management effort at the service provider end [1,2]. In cloud computing everything is provided in the form of a service. The services fall into three models, namely Infrastructure-as-a-Service (IaaS) [3,4], Platform-as-a-Service (PaaS) [5,6] and Software-as-a-Service (SaaS) [7,8]. As the cloud becomes conventional practice, cloud storage is also becoming more affordable [9]. Recent years have seen Google, Microsoft, Amazon and IBM emerge as some of the key players offering a range of cloud services to their customers. However, efficient VM utilization is required to deliver a consistent, flexible, interoperable and cost-effective service to SUs [10,11].
Here, virtualization is the main technology used to design energy-efficient data centres. A generalized VM allocation and migration scheme is illustrated in Fig. 1 [12,13]. The idea is to migrate running VMs from heavily loaded nodes to moderately loaded nodes in an energy-efficient manner. In this process, VMs can be created, deleted or migrated depending upon the workload, request batching, task scheduling, the kind of service selected and the location of the clouds (local or remote). Creating a virtual environment involves installing multiple VMs on a single physical node. As user demand increases, cloud SPs scale up additional VMs to maintain the quality of service offered to SUs. Live migration was evaluated as a way to offer dynamic and transparent migration of VMs between hosts and to identify the best target. Further, active VM servers prevent resource under-utilization by minimizing resource idle time. In addition, VM migration helps to identify hot-spots in the data centres and eventually leads to green cloud data centres. Virtualization has significantly reduced overall power or energy consumption. This result is obtained when the same set of tasks is completed with and without virtualization, which shows that virtualized ICT requires less energy than non-virtualized ICT.
Figure 1: VM allocation and migration scheme [12]
An over-utilized PM may degrade service quality, while an underutilized PM wastes resources. Therefore, efficient utilization of resources requires better resource management in terms of machine allocation and migration. A VM auto-scaling approach significantly helps in estimating resource utilization to predict the future workload.
However, although energy optimization through VM migration may resolve the issue to some extent, it requires manual selection of the number of nodes in the input and output layers [14]. Another possible option is the integration of bio-inspired algorithms to find a near-optimal solution for the cloud data centre [15]. Meta-heuristics are the primary procedure for resolving optimization issues in diverse fields, including cloud computing [16]. In recent times, a large number of meta-heuristic approaches have been integrated to identify the optimal solution in terms of VMs [17,18].
These approaches have proved to be powerful tools for finding an optimal solution in both local and global search spaces [19]. Therefore, the authors are motivated to present a VM allocation and migration framework, integrated with bio-inspired algorithms, that can identify the optimal VM to perform a task with minimal energy consumption.

Contribution of the Work
The major contribution of the work is to perform energy-efficient live VM migration between active nodes based on a meta-heuristic technique. It also identifies the VM migration that requires minimal energy expenditure without compromising the quality of service. The work further aims to minimize the number of migrations required to complete the task.
The list of abbreviations used in the paper is summarized in Tab. 1.

Organization of the Paper
The rest of the paper is structured in five sections. Section 2 discusses the state-of-the-art work on improving the energy efficiency of cloud data centres, together with the concepts of VMs, virtual machine migration and Swarm Intelligence algorithms. Section 3 presents the proposed algorithm and Section 4 discusses the results. The research work is concluded in Section 5, followed by the references cited in the paper.

Literature Review
The most important aspect of energy-efficient cloud computing is the successful deployment of a large number of applications with the least power consumption [20]. To address this, a lot of research is being conducted to reduce power consumption in the data centre [21][22][23][24]. Recently, a technique was proposed to minimize power wastage by minimizing the number of VM migrations between VMs and PMs; it significantly reduced the overall energy consumption [25]. A number of evolutionary approaches have been discussed in the literature to address VM allocation issues. Genetic algorithms have also been implemented to resolve the challenges of the cloud data centre. An experimental analysis using the Family Gene approach showed a significant reduction in the migration rate and the energy consumption [26]. Further, to conserve energy and maintain optimal performance, bio-inspired techniques have been used to migrate the most heavily loaded VM to the least loaded active node. Simulation analysis using CloudSim demonstrated a reduction of 72.34% in VM migrations with an energy conservation of 44.39%. Careful selection of VMs ensures that the demand for computational resources and the performance requirements are met side by side. A dynamic redesigned VM allocation algorithm was evaluated against BitBrains, PlanetLab and Google Cluster workload data. The performance analysis took into consideration both allocated RAM and CPU utilization, and evaluated VM migrations, host shutdowns and energy consumption against virtualization migration with abstract and static thresholds. This eco-friendly best-fit technique resulted in a reduced number of migrations with fewer SLA violations. However, VM selection could be improved by involving a fuzzy algorithm with unique selection criteria [27].
The key challenge is the increasing energy consumption of cloud computing data centres. To address this, a strategy called "PPRGear", based on sampling of energy utilization levels, was put forward. Its authors computed the energy utilization as the ratio of the number of server-side operations completed in a unit time to the power consumption. This was followed by VM allocation and migration based on an optimal balance between energy consumption and host utilization. The strategy demonstrated a reduction in energy consumption of 69.31% with fewer instances of shutdown time and performance degradation of cloud data centres [28]. The existing TESA resulted in too many migrations, which ultimately degraded the overall performance. Therefore, the limitations of TESA were addressed by involving ACO. The idea was to place each VM on the host that results in a reduced number of migrations. This approach also demonstrated a reduced number of SLA violations [29]. Live VM migration powered by virtualization technology was addressed with a micro-genetic algorithm to avoid over-utilization of hosts. An analysis against traditional GA and PABFD demonstrated a significant reduction in SLA violations and in the number of utilized hosts, with power consumption between 150.02 kWh and 163.05 kWh under different scenarios [30]. A meta-heuristic approach, AntPu, dynamically placed VMs inside PMs in order to utilize resources optimally with minimum SLA violations in cloud-based data centres [31].
The investigation of the existing work shows that most researchers have focused on power management and energy conservation through virtualization technology. The rising diversity and complexity of data centres, in terms of heterogeneity of devices, diverse OSs and the requirement for high performance, necessitates an architecture that can tackle all these challenges. Further, the research community is observed to be significantly attracted to biologically inspired computing approaches. These approaches are based on the ways biological species resolve the challenges of living, in terms of evolution, self-repair, self-organization, navigation and dynamic movement, despite incredible diversity. The lessons learned from biological systems are implemented as ACO, ABC, CS, PSO and FFA to address the issues of the cloud computing environment.
The inspiration drawn from these works motivates the authors to integrate, in the present work, ABC, which is based on the food-searching behaviour of bees, and CS, which is based on the egg-laying behaviour of cuckoos for rearing their offspring. The selection criteria are the coverage of a large search space in minimal time steps through the division of labour among bees, and the high coverage speed and global optimization attained by cuckoos. Further, the concept entails the use of minimal power or energy resources by selecting the best VM for migration. It minimizes the number of VM migrations required to achieve an energy-efficient scenario [32,33].

Virtual Machine Migration
In essence, virtualization generates a virtual image or version of a resource, e.g., a server, operating system (OS), storage medium or network resource, so that it can be used simultaneously on different machines. The prime objective of virtualization is to handle the workload in a more scalable, efficient and economical way. Virtualization enhances the use of user-accessible resources. Its most important benefits are separation among users, sharing of resources and aggregation of resources. The virtual machine migration architecture is shown in Fig. 2. Virtualization helps to split one physical machine into many virtual machines (VMs) that operate simultaneously and share the same physical resources, so a number of VMs can run on an individual physical machine. In the event of physical host overload, the load should be transferred to another machine. The process of transferring load from one machine to another is known as 'migration'. There are certain issues related to VM allocation and migration. The allocation and migration policy depends mainly on two factors, namely the selection of the PM from which the VMs are to be migrated and the selection of the PMs to which the VMs are to be migrated. A PM found to be over-utilized in terms of CPU utilization is referred to as a hotspot PM, and the PM to which the VMs are to be migrated is called the target PM. Swarm Intelligence (SI) has quite often been used to select the VMs from the hotspot PM and to select the target PM. The Firefly algorithm has also been used for the selection of the target VM, whereas the selection of the VM from the hotspot PM has been done using the computational load and the VM's performance against that load. The same set of authors have also applied the Artificial Bee Colony algorithm to the selection of VMs for a set of jobs supplied by users [34]. The allocation and migration procedure is shown in Fig. 3.
Step 1: The VM checks with every PM whether the PM can fulfil its resource demand.
Step 2: Compute the power consumption at each PM if the VM is to be allocated.
Figure 3: The allocation and migration procedure
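Steps 1 and 2 of the procedure can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. The `Host` class, the linear idle-to-peak power model and the rule of picking the PM with the smallest power increase are assumptions introduced here for illustration, since the remaining steps of Fig. 3 are not available in the text.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical PM description; all field names are illustrative."""
    name: str
    cpu_capacity: float    # total MIPS the PM can deliver
    idle_power: float      # watts at 0% CPU (assumed linear power model)
    peak_power: float      # watts at 100% CPU
    cpu_used: float = 0.0  # MIPS already allocated to VMs

    def power(self, cpu_used=None):
        """Power draw under the assumed linear model."""
        used = self.cpu_used if cpu_used is None else cpu_used
        utilization = used / self.cpu_capacity
        return self.idle_power + (self.peak_power - self.idle_power) * utilization


def place_vm(vm_mips, hosts):
    """Allocate a VM to the feasible PM with the smallest power increase.

    Returns the chosen Host, or None when no PM can fulfil the demand.
    """
    best, best_delta = None, float("inf")
    for host in hosts:
        # Step 1: can this PM fulfil the resource demand?
        if host.cpu_used + vm_mips > host.cpu_capacity:
            continue
        # Step 2: power consumption at this PM if the VM were allocated.
        delta = host.power(host.cpu_used + vm_mips) - host.power()
        if delta < best_delta:
            best, best_delta = host, delta
    if best is not None:
        best.cpu_used += vm_mips
    return best
```

Under these assumptions a 500 MIPS VM is placed on whichever feasible PM adds the fewest watts, and PMs without enough spare capacity are skipped.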

Artificial Bee Colony (ABC)
The ABC algorithm is a swarm-based meta-heuristic whose main elements are employed foragers, unemployed foragers and food sources. The colony comprises three groups of artificial bees: employed bees, onlooker bees and scout bees. The first part of the colony consists of employed bees, whereas the remaining part consists of onlooker bees. Each employed bee is associated with a specific food source. The onlooker bees watch the employed bees' dance in the hive to choose a food source. The scout bees search for new food sources at random. The employed bee is the food collector: it searches for food and passes it on to the onlooker bee for checking and variation. The scout bee is mainly a resting bee, also termed an unemployed bee, and is a non-working element in the proposed case. The sensitive labels obtained in the last step are passed as input to the ABC algorithm. ABC is inspired by the intelligent behaviour of honeybees, and this bio-inspired algorithm exhibits an admirable search ability on complicated problems [35].
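The three bee roles described above can be sketched with a generic, textbook-style ABC loop in Python. This is the standard algorithm for minimizing a function, not the energy-aware variant proposed in this paper. The function name `abc_minimize` and the parameters `n_sources`, `limit` and `iterations` are illustrative choices.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, limit=20, iterations=200, seed=0):
    """Generic Artificial Bee Colony search for the minimum of `objective`."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def neighbour(i):
        # Perturb one coordinate of source i relative to another random source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        candidate = sources[i][:]
        candidate[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        lo, hi = bounds[d]
        candidate[d] = min(max(candidate[d], lo), hi)
        return candidate

    def try_improve(i):
        # Greedy selection: keep the neighbour only if it is better.
        candidate = neighbour(i)
        cost = objective(candidate)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = sources[costs.index(best_cost)][:]

    for _ in range(iterations):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: pick sources with probability proportional to fitness.
        fitness = [1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=fitness)[0]
            try_improve(i)
        # Scout bees: abandon exhausted sources and explore at random.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
        # Remember the best food source seen so far.
        i = costs.index(min(costs))
        if costs[i] < best_cost:
            best, best_cost = sources[i][:], costs[i]
    return best, best_cost
```

In a VM setting the objective would be an energy or load cost over candidate placements. Here any function of a list of floats can be supplied.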

Proposed Algorithm
Over-utilization has been one of the popular concepts since the early days of cloud computing [36], and the proposed algorithm builds on it. The algorithm first uses the Modified Best Fit Decreasing (MBFD) method to allocate the VMs over the PMs. When it comes to hotspot selection, the underutilized PMs lose all their VMs and the over-utilized PMs lose some of their VMs. The proposed algorithm views this as a problem of identifying the maximum Rejection-probability (Rp) of the Bees (Bs) from hives 'h', where h is always equal to the total number of over-utilized PMs. The proposed algorithm is divided into three stages, as described below.
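The hotspot bookkeeping described above can be sketched as follows. The paper does not state the utilization thresholds, so the values `lower=0.2` and `upper=0.8` and the function name `classify_hosts` are assumptions made only for illustration.

```python
def classify_hosts(utilization, lower=0.2, upper=0.8):
    """Split PMs into over-utilized, underutilized and normal groups.

    `utilization` maps a PM name to its CPU utilization in [0, 1].
    The thresholds are assumed values, not taken from the paper.
    """
    over = [name for name, u in utilization.items() if u > upper]
    under = [name for name, u in utilization.items() if u < lower]
    normal = [name for name, u in utilization.items() if lower <= u <= upper]
    return over, under, normal


# Example: the number of hives h equals the number of over-utilized PMs.
over, under, normal = classify_hosts(
    {"pm1": 0.95, "pm2": 0.10, "pm3": 0.50, "pm4": 0.85}
)
h = len(over)
```

In this example `pm2` would release all of its VMs, while `pm1` and `pm4` each form a hive from which only some VMs are selected.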

Stage 1
Stage 1 aims to identify all possible hive elements for the 'h' hives. It uses the concept of Minimization of Migrations (MM) to find over-utilized VMs on an Identified-hotspot (Ih). If a VM is underutilized compared with the other VMs in the same list, it is migrated first. To be precise about the order, the underutilized VMs are migrated in descending order of CPU utilization. After every possible migration, the 'overload' parameter, which is again the CPU utilization of Ih, is evaluated. If the CPU utilization has normalized and Ih now falls among the non-hotspot PMs, Ih is removed from the list of possible hives where migration has to take place.
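One possible reading of Stage 1 is sketched below in Python. It assumes that "underutilized compared with the other VMs" means below the mean utilization of the VMs on the same hotspot, and that the hotspot threshold is 0.8. Both are assumptions, as is the function name `stage1_migrations`.

```python
def stage1_migrations(vm_utils, upper=0.8):
    """Select Stage 1 migrations for one identified hotspot (Ih).

    `vm_utils` maps a VM name to the share of the host CPU it uses.
    Returns (migrated VM names, whether the host is still a hotspot).
    """
    remaining = dict(vm_utils)
    mean_util = sum(vm_utils.values()) / len(vm_utils)
    # Candidates: VMs that are underutilized relative to their neighbours,
    # visited in descending order of CPU utilization.
    candidates = sorted(
        (vm for vm, u in vm_utils.items() if u < mean_util),
        key=vm_utils.get,
        reverse=True,
    )
    migrated = []
    for vm in candidates:
        # Re-evaluate the 'overload' parameter after every migration.
        if sum(remaining.values()) <= upper:
            break
        migrated.append(vm)
        del remaining[vm]
    still_hotspot = sum(remaining.values()) > upper
    return migrated, still_hotspot
```

A host that is still a hotspot after all Stage 1 candidates have moved would remain in the list that Stage 2 evaluates.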

Stage 2
Stage 2 evaluates the Remaining Ih (RIh) and selects the VMs as per the designed onlooker bee selection procedure. The working mechanism of Energy Aware ABC (EA-ABC) is shown in Fig. 4.
Step 1: [equation not recovered]
Step 2: Eq. (2), expressed in terms of Cpo and n, where Cpo is the power consumed by other VMs of other hives with nearly the same MIPS.
Step 3: [equation not recovered], where MUTo is the most idle time spent by other VMs of other hives with nearly the same MIPS.
Step 5: The bee food value (bfv) is also decided here.
Step 6: Initiate the first fly variation (fv) by calculating the average of every metric in Bf and varying it by the difference between the maximum and minimum value of each attribute, multiplied by a random value between 0 and 1,
where vmcount is the total number of VMs in the current hive under Ih, rnd is the random value between 0 and 1, and k is the total number of hives.
Step 8: There are 4 flies, hence a Pre-Judgement Metric (P-JM) is created containing 4 * vmcount [Fv] values, together with an f-onlooker value for each bee and every fly position.
Step 9: Calculate the average threshold (at) of each fly position.
Step 10: For every position in the fly table, if the bfv of the current fly is greater than at, mark 1 in jm for the current fly.
Step 11: After the completion of all the flies, count the 1s and 0s for every VM in the hive. If the 1s outnumber the 0s, the VM has been under major load and still performed well. This means the VM needs some change to avoid over-utilization and should be migrated if required.
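The fly variation of Step 6 and the final thresholding and voting steps (the average threshold per fly position, the marks in jm, and the count of 1s and 0s per VM) can be sketched in Python as follows. Because the equations that define bfv are missing from the text, the sketch takes a ready-made table of bfv values as input. The function names and the table layout `bfv[fly][vm]` are assumptions.

```python
import random


def fly_variation(bf_rows, rng):
    """Step 6: per attribute, the average varied by (max - min) * rnd."""
    columns = list(zip(*bf_rows))
    return [
        sum(col) / len(col) + (max(col) - min(col)) * rng.random()
        for col in columns
    ]


def judge_vms(bfv):
    """Vote on each VM over all flies.

    `bfv[f][v]` is the bee food value of VM v at fly position f.
    Returns one boolean per VM: True when its 1s outnumber its 0s.
    """
    n_flies, n_vms = len(bfv), len(bfv[0])
    ones = [0] * n_vms
    for row in bfv:
        average_threshold = sum(row) / len(row)  # 'at' of this fly position
        for v, value in enumerate(row):
            if value > average_threshold:        # mark 1 in jm
                ones[v] += 1
    # A VM is flagged when it has more 1s than 0s across the flies.
    return [count > n_flies - count for count in ones]


# Example with 4 flies and 3 VMs (values are made up for illustration).
flags = judge_vms([
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.1],
    [0.7, 0.6, 0.1],
    [0.2, 0.9, 0.1],
])
```

With an even number of flies a VM can tie, as the second VM does here with two marks of each kind. The sketch treats a tie as "not flagged", which follows the strict "greater than" wording of the procedure.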

Stage 3
The third stage evaluates the performance of VM migrations from the hotspot PM to the target PM over at least 100 migrations during the simulation. The migration count, along with the Bf, is passed to a Support Vector Machine (SVM) to learn the pattern of overload and underload. The trained model is used to support EA-ABC so that it performs better in Stage 2.
SVM is a Machine Learning (ML) classification algorithm whose capabilities are extended by selection methods known as kernel functions [37,38]. It creates a repository with two categories, namely Above Post Performance (APP) and Below Post Performance (BPP). The neutralization parameter is Bf and the neutralization method is similar to the methods explained in Stage 2. Once the system has learned from real-time experience over the first 1000 simulations, the SVM contributes 25% of the decision support.
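The APP/BPP classification can be illustrated with a small soft-margin SVM in Python. To stay self-contained, the sketch uses a linear kernel only and fits it by subgradient descent on the hinge loss, which is a simplification of the kernel-based SVM mentioned above. The two-dimensional feature vectors, the hyper-parameters and the function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(samples, labels, lam=0.01, lr=0.001, epochs=20000):
    """Fit a linear soft-margin SVM; labels are +1 (APP) or -1 (BPP)."""
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0

    def objective(w, b):
        hinge = sum(
            max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(samples, labels)
        )
        return 0.5 * lam * dot(w, w) + hinge / n

    best = (objective(w, b), w[:], b)
    for _ in range(epochs):
        # Subgradient of the regularized hinge loss over the full batch.
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            if y * (dot(w, x) + b) < 1.0:  # sample violates the margin
                for d in range(dim):
                    grad_w[d] -= y * x[d] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
        # Subgradient steps are not monotone, so keep the best iterate.
        value = objective(w, b)
        if value < best[0]:
            best = (value, w[:], b)
    return best[1], best[2]


def predict(w, b, x):
    """Label a migration record as APP or BPP."""
    return "APP" if dot(w, x) + b >= 0 else "BPP"
```

A record would be a feature vector built from the migration count and the Bf metrics. How the 25% SVM contribution is combined with the EA-ABC decision is not specified in the text and is therefore left out of the sketch.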

Results and Discussion
The experimental analysis of the proposed technique for addressing the energy challenges of the cloud data centre is performed using Matlab. The simulation study illustrates the actual behaviour of the algorithm. The parametric description, including the specification details, is given in Tab. 2. The energy consumed, together with the reduction in the number of VM migrations, is the major parameter analysed in this work. The comparative analysis of the number of hosts utilized and the number of migrations against a rise in the number of VMs is listed in Tab. 3. The comparative study compares this work, which is based on the two techniques ABC and CS, against two existing works by Kansal et al. [39] and Naik et al. [40].
Tracking the number of active hosts is very important in order to prevent situations in which hosts sit idle and consume unnecessary energy. Therefore, this work is first compared against the two existing works in terms of the number of active hosts. The implementation of the optimization architecture results in a significant decrease in the number of active hosts. It is observed that the number of hosts and the number of migrations increase with the number of VMs. The improvement analysis illustrated in Fig. 6 shows a 10% to 30% improvement in the number of hosts over 10 to 300 VMs. This is because ABC and CS choose more accurate hosts for VM allocation with reduced discovery time, owing to their fast coverage speed in identifying the global optimum.
This improvement in resource utility further reduces the number of VM migrations, which in turn avoids wasting energy resources. The change in the number of migrations with respect to the number of VMs is also summarized in Tab. 3. It is observed that an increase in the number of VMs results in an increase in the number of migrations. However, the number of migrations required by this work is much lower than in existing studies. The improvement observed for this work in terms of VM migration is illustrated in Fig. 7. The simulation analysis shows that EA-ABC + SVM requires fewer VM migrations than the other existing studies. This reflects an improved ability to identify and allocate the best VM in order to minimize the number of migrations. The ability of the proposed EA-ABC and SVM to actively allocate the best VM is further evaluated in terms of energy consumption and the number of SLA violations. Tab. 4 summarizes the variation in energy consumption and SLA violations against a rise in the number of VMs. It is observed that the overall energy consumption is strongly affected by a rise in the number of VMs and their migrations. The improvement analysis shown in Figs. 8 and 9 indicates that this work is a highly optimized VM allocation and migration architecture that does not compromise energy resources and exhibits the fewest SLA violations.

Conclusion
This paper presents a novel algorithm that minimizes the total number of migrations by selecting the VMs to be migrated more accurately using Machine Learning (ML) supported Swarm Intelligence. This eventually saves the energy that would otherwise be consumed due to over-utilization of PMs. A new algorithmic architecture for the enhancement of Artificial Bee Colony (ABC) is introduced in the paper and is named Energy Aware-ABC. To further investigate the performance of the VMs, a set of parameters is kept in a record set, which is then used to train an SVM on good and bad migrations. The algorithm is compared with other algorithms of similar structure and is found to perform better than other state-of-the-art techniques. The limitation of the work is a resource constraint in the cloud data centre, which is to be accessed by the cloud broker.
Funding Statement: This research received no specific grant from any funding agency in the public or commercial sector.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.