Solving the Task Starvation and Resources Problem Using Optimized SMPIA in Cloud

In this study, a new feature is added to the smart message passing interface (SMPI) approach (SMPIA) based on the prioritization method, which can completely eliminate the task starvation and lack of sufficient resources problems through prioritizing the tasks. The proposed approach is based on prioritizing the tasks and the urgency of implementation. Tasks are prioritized based on execution time, workload, the task with a more sensitive priority is executed earlier by the free source. The idea of demand-bound functions (DBFs) was extended to the SMPIA setting based on partitions and caps. For each task, two DBFs are constructed, DBFLO and DBFHI, for the LO and HI criticality modes, respectively. The simulation results returned by MATLAB showed that with the optimized SMPIA (O-SMPIA), the parameters of maximum service execution time, response time, delay time, and throughput improved in this work. In addition, the results confirmed that the reduction of execution time, completion time, and resource consumption time did not affect the response time and throughput of workflow tasks and did not cause inefficient use of resources in virtual machines (VMs) and data centers (DCs). The evaluation of performance metrics showed that the delay, response time of the Greedy algorithm was less than that of Max-Min and Min-Min. At the same time, the execution time of Max-Min was less than the others and the throughput of the Greedy was longer. The effect and throughput of O-SMPIA became more obvious as change to the job count and the number of cloud workloads increased. It is also worth mentioning that one of the main advantages of the O-SMPIA to other methods is the efficient use of time to execute all the defined tasks by CPU.


Introduction
Cloud computing (CC) delivers computing resources like software and hardware as a service to the users through a network. Due to the scale of the modern data centers (DCs) and their dynamic resources provisioning nature, efficient scheduling techniques are required to manage these resources [1,2]. The smart message passing interface (MPI) can be used as a new idea and scheduling technique in CC that In general, the optimization task starvation problems can be divided into discrete and continuous problems. The decision variables for a combinatorial problem have discrete values, while the decision variables for a continuous optimization problem can take up values within the domain of real values (Ri) [14,15]. Moreover, generating a schedule to optimize the two most important, yet conflicting, scheduling objectives (i.e., execution cost and execution make span (MS)) has become a complicated problem. For example, optimizing the execution cost increases the execution MS. This is due to the interlink that exists between these objectives. MS and cost optimization problem persist because the virtual machine (VM) selection (which is a key to managing resource utilization (RU) to improve system throughput) is usually ignored by researchers [16].
Based on the number of criteria involved in the optimization task starvation problem, this can be divided into single-criterion and multicriteria. The task of single-criterion optimization is to find the optimal solution according to only one criterion function. When the optimization starvation problem involves more than one criteria function, the task is to find one or more optimal solutions regarding each criterion. In such condition, a solution that is appropriate to one criterion can be inappropriate to another, and vice versa [17]. Therefore, the goal of multi-criteria optimization is to find a set of solutions that are optimal with respect to all other criteria. Note that, most real-world problems are of the multi-criteria type. The review of the literature shows that some research gaps such as VM selection and task mapping criteria still exist, which require further investigation [16]. The obtained results of the experiment indicate that the optimized smart MPI approach (O-SMPIA) outperform the SMPIA [18] in terms of total execution time (TET), MS and RU. Other details are listed in Tab. 1.
The aim of this work was to solve the problem of starvation of tasks and lack of sufficient resources using O-SMPIA. To this end, we combined the Greedy, Max-Min and Min-Min [18] algorithms, with the O-SMPIA. In this matter, O-SMPIA was used at the DCs level due to the increase in the use of DCs in In this paper, the earliest deadline first with virtual deadlines (EDF-VD) algorithm was applied to the utilization caps. EDF-VD was combined with O-SMPIA, which resulted in the improvement of the system performance. The O-SMPIA helpeds to shorten the execution time, response time, and delay time and increased the throughput.
The main contributions of this paper are as follows: (1) A priority-based task scheduling algorithm was developed. O-SMPIA developed in this study is based on the prioritization method that can completely solve the problem of task starvation and resource consumption. SMPIA was extended to consider tasks execution time, MS and RU as metrics for the performance evaluation and optimization goals. (2) O-SMPIA solves the starvation problem by considering the type of urgency of the execution time.
Tasks are prioritized based on the execution time; the tasks with the highest priority is executed earlier by the free source. (3) O-SMPIA also improved the system throughput, hence significantly improving the system performance. O-SMPIA helps to efficiently use the time in a way to execute all the tasks by CPU. (4) The idea of demand-bound functions (DBFs) was extended to the SMPIA setting based on partitions and caps. For each task, two DBFs, DBF LO and DBF HI , are constructed for the low-and highcriticality modes, respectively.
The motivation for this research is the implementation of workflow on the telecommunications transaction application in order to achieve proper processing time and manage cloud systems. It also aims to improve the performance of algorithms, increase the quality of service (QoS), solving the problem of task starvation and lack of sufficient resources considering the priority of tasks in MPI communications of the desired virtual networks. The need for this research is to delay MPI communication in the virtual MPI bus (VMPIB) [18]. Using O-SMPIA and SMPIA in a telecommunications transaction application increases the performance of the application.
The remainder of this paper is organized as follows: related work is presented in Section 2; cloud computing and task starvation model based on SMPIA in Section 3; the proposed optimized SMPIA method in Section 4; optimization criteria in Section 5; the performance evaluation in Section 6, and; ultimately, conclusions and future scope are presented in Section 7.

Related Work
In [18], the SMPIA aimed to improve the starvation problem for workflow tasks and lack of sufficient resources at the VM level. This problem was solved with Greedy, Max-Min, and Min-Min algorithms. These algorithms, especially Max-Min and Min-Min, solves the above-mention problems based on run times of alternative virtual machines (AVMs) and priority scheduler algorithm. The TET and MS were prioritized as well.
In [16], the multi-objective workflow optimization strategy (MOWOS) employs the tasks splitting mechanism to split large tasks into sub-tasks to reduce their scheduling length. The simulation results showed that, the MOWOS algorithm had less execution cost and, better execution MS, and also it utilized the resources better than the existing HSLJF [15] and SECURE [19] algorithms.
In [13], a hybrid of shortest-job first (SJF) and round robin (RR) is one among the foremost used and powerful hybrids for solving starvation where we will enjoy SJF performance in reducing the turnaround and from RR in reducing task waiting time. But the task quantum value was always the obstacle in having the optimum hybrid.
In [12], to solve the starvation, a hybrid shortest-longest scheduling algorithm was presented. The capabilities of each VM and the length of the task to allocate the tasks to most convenient VMs were considered. Thus, their algorithm overcame the starvation problem through considering both the provider and user's requirements. The experimental results proved that the proposed HSLJF algorithm outperformed the existing algorithms in terms of minimizing the average MS, response time, and the actual execution time of tasks, while maximizing the RU and throughput.
In [20], EDF-VD was applied to each subset or partition independently. Dividing the processor into two halves has almost no performance degradation compared to EDF-VD (with a totally dedicated processor). The use of smaller caps of a partition led to, lower performance in terms of a set of scheduled tasks. If one HI task switches to HI mode, then only those LO task within the same partition or subset will be discarded, whereas LO tasks in other partitions continue running. The main advantage over the severbased approach is that there is no starvation period, i.e., the time interval between two runs/repetitions where no service is provided to tasks within the server. In contrast to this, tasks in a partition run as long as they have not used up their assigned utilization, i.e., as long as they are below their utilization cap.
In [11], a priority-based process scheduling (PRIPSA) algorithm was presented, which was developed with the block-based queue in CC. The proposed algorithm gives the high priority to the processes supported their lead time then burst time. Therefore, it is able to solve the starvation problem. The priority-based scheduling algorithm was focused on moderating the starvation problem. For the preventive algorithm, the starvation increased linearly, while for the priority-based scheduling algorithm, the increase rate was extremely slow.
The resources are used good in case of improved genetic algorithm, but some resources are not used efficiently in case of standard genetic algorithm, there is a large gap in the percentage usage of resources [21]. The proposed cuckoo PSO (CPSO) algorithm achieves minimal deadline violation rate when compared with algorithms such as PBACO, ACO, Min-Min, and FCFS [22]. The standard PSO easily got trapped into the local optimum solution, which results in improved being premature convergence [23]. The comparison of the improved PSO algorithm (in crossover and mutation) with PSO showed that improved PSO was not only converged faster, but also it was executed in a large scale faster than the other two algorithms [24]. According to dynamic nature of cloud and deficiencies of methods presented, proposals and solutions have been suggested to improve the accuracy and the efficiency of the prediction methods [25]. The new methods should be able to consider all dimensions of the application to making the appropriate learning model. The efficient learning model could enhance the prediction accuracy. Also, suggested new approaches to develop prediction models in a way of hybrid nature [26].
In the current paper, an important feature was added to the SMPIA based on the prioritization method, which improves the system performance to an acceptable degree and leads to higher priority tasks that need to be implemented sooner.
Tab. 2 reports the details of the comparison of performance parameters of the algorithms and methods in terms of optimization.

Cloud Computing Model Based on SMPIA
The CC model is aimed at providing resources and services through a network; resource allocation and workflow scheduling are two focal points of CC [27][28][29]. Workflow as a service (WaaS) of deal and tenders from ministry of communication and information technology of Iran (MCITI) was defined as small and big transactions. All the flows of the selected WaaS were logged into the system after selecting the small or big WaaS of the waiting queue. Each DC had a number of servers, each of which had several VMs in VMPIB. VMs can have a two-way communication. As depicted in Fig. 1, all VMs were connected to VMPIB. The problem in the current work is task starvation and lack of sufficient resources in VMs and DCs level to optimize the performance parameters simultaneously. To this end, the O-SMPIA is presented in this study. To define the concept of the VMPIB and the VM connection approach, first, a topology was considered with the connected graph G = (D, V), where V = {VM 1 , VM 2 , VM 3 , …, VM m } and D = {DC 1 , DC 2 , DC 3 , …, DC d }. D and V represent the entire number of DCs and VMs, respectively. The function φ: V → D was considered to regulate the dependencies of flows and their tasks including the F = {F 1 , F 2 , F 3 , …, F f } set. Moreover, the function θ: F→ V and T = {T 1 , T 2 , T 3 , …, T t }. φ is a function, in which VM is assigned to each DC in VMPIB. θ is a function, in which the flow that could be executed by VM in VMPIB is determined. F is the total flows depending on each other. And T is defined as the total transactions [18].

Task Starvation Model Based on Optimized SMPIA
In the present article, a new feature was added to SMPIA based on the prioritization method, which can completely eliminate the problem of task starvation and resource consumption. In this matter, the starvation of tasks and resources is addressed by prioritizing the tasks as suggested in [13,20,30]. High priority tasks are considered HI and low priority tasks are considered LO in O-SMPIA to schedule the sporadic tasks that run under preemptive uniprocessor scheduling. Each job consists of task count, job size, job priority, position, type, etc. The minimum separation between any two jobs or instances of a task is denoted by T i , and we assume implicit deadlines, i.e., 8i: There is no selfsuspension, and the context-switch overheads are assumed to be negligible on different processors. According to Eq. (1), LO a task dos not in the other subsets but only those in τ x , where x, m ∈ {LO, HI}.
Finally, among the four potential criticality-to-mode combinations, note that only U LO LO , U LO HI , and U HI LO are defined. U HI LO does not exist for a particular τ x ⊂ τ, since LO tasks are dropped when an HI task in τ x changes the HI mode and, hence, do not run in the τ x 's HI mode. In the second part, we use the same system model as in previous work on the scheduling tasks on each partition. Formally, each task τi set τ = {τ 1 , …, τ m } is defined by tuple (C i (LO), C i (HI), D i , T i , L i ). For such a system to be successfully scheduled, all (nondiscarded) jobs on the same partition must always meet their deadlines. A similar concept is the supplybound function (SBF) (l), which lower-bounds the amount of supplied execution time of the platform in any time window of size l. For instance, a unit-speed, dedicated uniprocessor has SBF (l) = l. Other platforms, such as virtual servers used in hierarchical scheduling, have their own particular SBFs. SBF is of the maximum unit speed if in accordance with Eq. (2) [20,31].
Its basic idea for solving the starvation problem in this paper is to promote HI jobs in the LO mode by shortening their deadlines so as to reserve the processor capacity for the HI mode. From the above description, the LO and HI tasks need to be schedulable with their corresponding c LO i in the LO mode. Similarly, in the HI mode, the HI tasks also need to be schedulable with their corresponding c LO i . As a result, the following two schedulability conditions expressed by Eqs. (3) and (4) In the following, Algorithm 1 (Tab. 3) is presented for schedule tasks.
Although using fixed UL is more straightforward, in some cases, we would like to find an optimum value of UL for a given τ x ⊂ x such schedulability could be guaranteed. A DBF (τi , l) gives an upper bound on the maximum possible execution demand of task τi in any time interval of length l, where demand is calculated as the total amount of required execution time of jobs with their whole scheduling windows within the time interval. The idea of DBFs was extended in this article to the SMPIA setting based on partitions and caps. For each task, two DBFs are constructed, DBFLO and DBFHI, for the low and high-criticality modes, respectively. While the system is in low-criticality mode, each task τi behaves as a traditional sporadic task with parameters C i (LO), D i (LO), and T i . Note that it uses relative deadline According to Fig. 2 and also as suggested in [20,33], first of all, tasks from different parts of the system will be sent to the central processor.
The task checker interface will process all incoming tasks. Then each of the inputs LO and HI are checkeds. After diagnosis of the LO or HI tasks assigned to each tag, they are sent to the corresponding queues. Each queue, after receiving the tasks, sends them to the sorter interface. The sort by priority interface set the HI tasks on top of the table and other tasks based on priority are sorted on table. All the tasks are then entered into SMPIA, and the algorithm tasks between partitions generated by caps distribution and scheduling tasks can be done. During the parallel execution of tasks on each partition, the bounding techniques are also applied to the HI and LO tasks. Then, the sensitivity of HI is checked. At this time, there is no need to time the processing of LO tasks. One of the main advantages of the proposed method to other methods is the efficient use of time to execute all the tasks by CPU.

The Proposed Optimized SMPIA Method
According to the ideas presented in this paper for solving the problem of task starvation and lack of sufficient resources, the Algorithm 2 (Tab. 4) is proposed as follows: Therefore, one important advantages of the O-SMPIA method is that the prioritization of tasks and their scheduling using VMs in a way to solve; the starvation problem. As explained during this section, the O-SPMIA method is based on task prioritization. By prioritizing the tasks, the schedule is changed, which causes the selection of the desired VM to be done using the Greedy algorithm. The Greedy algorithm, based on calculating its variance and HI and, LO tasks, selects the acceptable VM. Therefore, the best VM is first selected according to the Max-Min or Min-Min algorithm; then, the tasks with higher priority are executed. In SMPIA and O-SMPIA, all the created processes will be submitted to one node; as a result, the first machine will have the whole load, whereas no tasks will be submitted to the other machines. The working procedures of the SMPIA and O-SMPIA are shown in Fig. 3

Set Bounding Technique On HI and LO Tasks
Transform Task To Other Parts End Figure 2: The flowchart of the proposed method according to [18] and also as suggested in [20,33]    The nfs-kernel-server and MPICH-3.0.4 packages are installed to implement MPI, thanks to a special feature in the master system. After that, the Htop software is installed to monitor the processes running in parallel in MPI. Then, in the Hosts file, the master system is defined for IP systems. The copy of the program file was compiled in the "Mirror" folder using the MPI compiler. At the end, the program file is copied to the mirror folder and compiled it using the MPI compiler. Then, the program compiled by MPICH is executed using the following command: /nfsshare$ mpirun -f hosts -n number. /MPI_sample.
It should be mentioned that instead of number, the number of processors desired to be involved with the program could be entered. Moreover, the program names are entered instead of MPI_sample. Afterwards, the program is run well on all systems and the performance of the processors is observed on each of the Slave computers with the assistance of Htop. Ultimately, the user code is copied in the AVM buffer by MPI, and then is prepared for parallel execution [18]. MPICH-3.0.4 runs using multi-purpose daemon (MPD), that is a, process manager that is able to launch the execution of a parallel program on multiple machines. Additionally, it provides communication between all launched processes to exchange data. MPD starts on all the nodes participating within the cluster. Using mpdboot, MPD launches itself on all nodes of the cluster. The MPD clusters configuration has a ring topology, where each node is connected to the next node, and therefore the last node is connected to the first one. When a parallel program calls MPI_Comm_spawn, without specifying where to run the newly spawned process through the [mpi info] parameter, the process manger MPD starts the new spawned processes on the next node in the MPD's ring. For the newly created processes, the default is that follow the RR algorithm starting from rank (0) every time.
Therefore, the creation of processes by using MPI_Comm_spawn has two scenarios: 1. The first scenario happens when using one MPI_Comm_spawn function to create many processes by using the [Maxproc] parameter. It will follow the Greedy algorithms to distribute processes starting from rank (0), and this will distribute the load equally. 2. The second scenario suffers from a significant problem when the MPI_Comm_spawn function is called many times. For example, when calling the function iteratively inside a loop, or when making recursive calls for this function, in both cases, each call of this function will submit the new process every time to the first rank. In this case, all the created processes will be submitted to one node, the first machine. As a result, the first machine will have the whole load, whereas no tasks will be submitted to the other machines. As stated above, the decision of submitting new tasks to hosts is made in the function level. In every call to this function, the submission of tasks happens according to the Greedy algorithms starting from the clusters' first host independent of any MPI_Comm_spawn previously named. This unbalanced load happens because of the decentralization in decision made by MPI_Comm_spawn about submitting the new spawned process to a host since MPI-2 does not introduce any way of scheduling [13,34].

Optimization Criteria
This section explains the metrics used to measure the effectiveness of O-SMPIA in task scheduling. Different kinds of optimization criteria were addressed, e.g., MS, cost, budget, deadline, TET, RU, throughput, load balancing, and energy efficiency. In General, these optimization criteria are categorized into two types based on cloud service: cloud users' type and cloud service provider's type [1,12,18,19].

Make Span
It is defined as the completion time of the last task that is required to complete and leave the cloud system [18,35,36]. It is calculated using Eq. (6).

Execution Time
Execution time is defined as the TET of every single task from the beginning to the end. TET of all tasks from the beginning to the end [18]. The ET is equal to execution time of task that is calculated based on Eq. (7) [12].
L indicates the length of the task which is measured with the amount of million instructions. MIPS indicates the amount of million instructions per second allocated to VMi and loadVMj indicates the previous existing tasks performed by VMj as allotted in Eq. (8), N is the number of tasks into specific VM [12,18].

Deadline
It represents the termination of running tasks at a particular time [37].

Response Time, Delay or Latency
Indicates the total time needed to respond by a cloud computing system and this metric should be minimized. It is calculated based on the Eq. (9) [12].
Response Time ¼ ðCompletion TimeÞ i À ðSubmission TimeÞ i (9) The expected completion time can be determined by calculating the expected completion time of task in each VM based on Eq. (10) [12].

Resource Utilization
It refers to making the most of the available resources and keep resources as busy as possible. It is useful for service providers to get again by leasing the finite resources to the cloud user on-demand [18,38].

Throughput
It measures the number of completed tasks per unit time [11,12,39,40]. The throughput is calculated through the following Eqs. (11), n is total number of tasks.

Load Balancing
This criterion in CC refers to the distributions of loads among VMs over physical resources. Many techniques have been introduced by the authors in [41][42][43].

The Performance Evaluation
The O-SMPIA was simulated to optimize the system performance to achieve the least response time, delay time, and execution time as well as the highest throughput, and also to solve the problem of insufficient resources. In addition, the proposed O-SMPIA was compared with SMPIA. With O-SMPIA, the performance of the system was significantly improved, which is discussed in this section. The tests were conducted on the actual data in a homogenous environment including 4 DCs, 22 servers, 132 VMs, 132 flows, and 324 telecommunication equipment. To do so, 201535 records were collected on transactions (deals and tenders) of the telephone company from 2011 to 2017. We took average of 50 simulation results for each graph. The simulation and implementation were performed in MATLAB. The experiments were done on a system with the following features: Windows 10, CPU 2.5 GHz, core i7-2450 m, and 16 GB RAM. In this research, data sets of telecommunication transactions applications (deals and tenders) in MCITI and workflows were used for simulation purposes [18]. In this section, the results of O-SMPIA, including throughput, response time, delay time and execution time are presented based on three Min-Min, Max-Min, and Greedy algorithms. This section explains the metrics used to measure the effectiveness of SMPIA and O-SMPIA in task scheduling. Different kinds of optimization criteria were addressed, e.g., MS, cost, budget, deadline, TET, RU, throughput, load balancing, and energy efficiency. General, these optimization criteria are categorized into two types based on cloud service: cloud users' type and cloud service provider's type [1,12,18,19]. The details of simulation by data values parameters are shown in Tab. 5.
6.1 SMPIA Performance 6.1.1 Delay Time  Fig. 4a, shows the delay time of SMPIA based on the number of iteration. As can be seen in this figure, the delay time in Max-Min_SMPIA based on the number of iteration is better than that of the other algorithms, and also it is faster. Therefore, the delay time for workflow in the CC systems using Max-Min_SMPIA was found more efficient than the other methods. Fig. 4b, shows the load balancing of SMPIA based on iteration. As this figure displays, the load balancing of Max-Min_SMPIA based on the iteration is best and more efficient than that of the other methods.

Load Balancing
6.2 Optimized SMPIA Performance 6.2.1 Throughput Fig. 5a, shows the throughput of the O-SMPIA algorithms. Throughput counts the number of job count completed within the task in the per unit time. As can be seen in the figure, the throughput in O-SMPIA based on the Greedy algorithm is better than that of the other algorithms, and also has a better performance. In all three methods, throughput gradually increased. Therefore, the throughput rate for workflow in CC systems using O-SMPIA based on the Greedy algorithm was found more optimal. Fig. 5b, shows the response time of the O-SMPIA algorithms. As can be seen in this figure, the response time in O-SMPIA based on the Greedy algorithm is better than that of the other of other algorithms, and it is faster. Fig. 5c, shows that the execution time in O-SMPIA based on the Max-Min algorithm is better than that of the other algorithms, and it is faster. Therefore, the execution time for workflow in CC systems using intelligent MPI based on the Max-Min algorithm was found more efficient than the other methods. As demonstrated by this figure, the delay time in O-SMPIA based on the Greedy algorithm is better than that of the other algorithms, and it is faster. Therefore, the delay time for workflow in CC systems using Greedy_O-SMPIA was shown to be more efficient than the other methods.

Execution Time
The evaluation of performance metrics showed that the delay and response time of the Greedy algorithm were less than those of the others. At the same time, the execution time of Max-Min was less than those of the others and the throughput of the Greedy was longer.

Conclusions and Future Scope
In this work, O-SMPIA was presented, which was designed based on task prioritization. O-SMPIA succeeded in alleviating the problem of task starvation and lack of resource by taking into account the urgency of execution time. It also increased the throughput of the system. The HI tasks were assigned to the best available selected resource. In this paper, the EDF-VD algorithm was applied to a fixed version of utilization caps. The EDF-VD algorithm was combined with O-SMPIA, which caused the improvement of the system performance. O-SMPIA shortened the execution time, response time, and delay time and, at the same time, it increased the throughput. O-SMPIA was simulated in the MATLAB The best mode was greedy, in which with O-SMPIA, delay, response time, and throughput were improved. The results with actual data of the environment showed that O-SMPIA was more efficient than SMPIA in the cloud system. Comparing the performance metrics showed that the Greedy algorithm was better than the Max-Min and, Min-Min algorithms in terms of delay, throughput and response time. On the other hand, Max-Min was found better than Greedy and Min-Min in terms of execution time. These results reflect the effectiveness of O-SMPIA on the cloud resource consumption time and resource efficiency. The simulation results show that O-SMPIA was able to increase the system throughput by quickly selecting VMs and responding to the assignment of tasks to choose VMs during a timely manner. The effect and throughput of O-SMPIA is more obvious as change within the job count and the number of cloud workloads increase. One among the most advantages of the O-SMPIA to other methods is efficient use of your time to execute all the tasks by CPU. In the future, it is recommended to analyze multiple modes (prerequisite and post-requisite) and the quality of SMPIA and O-SMPIA with scientific and artificial workflows.
Funding Statement: This Research was supported by Islamic Azad University-Sari Branch, Sari, Iran.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.