Workflow management technologies have dramatically improved their deployment architectures and systems alongside the evolution and proliferation of distributed cloud computing environments. In particular, cloud computing environments are expected to provide a suitable distributed computing paradigm for deploying very large-scale workflow processes and applications with scalable on-demand services. In this paper, we focus on the distribution paradigm and its deployment formalism for very large-scale workflow applications that are deployed and enacted across multiple heterogeneous cloud computing environments. We propose a formal approach to fragmenting very large-scale workflow processes and their applications both vertically and horizontally, and to deploying the resulting workflow process and application fragments over three types of cloud deployment models and architectures. To concretize the formal approach, we first devise a series of operational situations in which workflow processes and applications are fragmented into cloud workflow components and deployed onto the three types of cloud deployment models and architectures. Together, these concrete approaches constitute the deployment-driven fragmentation mechanism, which can be applied to very large-scale workflow processes and applications as an implementation component of cloud workflow management systems. Finally, we believe that our approach and its fragmentation formalisms can serve as a theoretical basis for designing and implementing very large-scale, maximally distributed workflow processes and applications deployed on cloud deployment models and architectures.
Workflow management technology is used to model, automate, monitor, and optimize repetitive tasks within a variety of workflow processes, including business processes and scientific workflow processes. As distributed computing systems and the service-oriented architecture (SOA) have proliferated rapidly, the field of workflow management has encountered a new challenge: the transition from conventional workflow management technology to this new state-of-the-art technology. In particular, the need for scalability and on-demand services for large-scale workflow processes has been emphasized in traditional workflow management environments. In this regard, cloud computing deserves to be considered the most well-suited computing environment for managing large-scale workflow processes, particularly scientific workflows. Accordingly, many researchers and practitioners have begun to recognize the potential of the cloud workflow management system (CWMS), which runs on a cloud computing environment, and the need to further develop its functionalities.
A CWMS is capable of enacting complex workflow processes requiring massive computing resources distributed across multiple clouds. Technical efforts for scalability are not required on the user side because computing infrastructure services are provided by third-party providers (e.g., Amazon, Microsoft, and IBM), and one can easily access their services at any time. The availability of on-demand services is another advantage of the CWMS. End users with different goals (e.g., process designers and process performers) can access the CWMS and immediately request and receive the services they require.
In this paper, we focus on distributing workflow applications across multiple cloud environments. Workflow applications, which are implementations of tasks (or activities) comprising workflow processes, are characterized by their heterogeneity and demand for substantial computing resources. For instance, a scientific workflow process, which is a type of large-scale workflow process, generally consists of a set of data-intensive tasks implemented as diverse and massive applications for scientific experiments. Therefore, the effective distribution of workflow applications across different cloud environments would lead to greater management and maintenance capabilities for large-scale workflow processes. To address this issue, we propose an approach to workflow application fragmentations based on cloud deployment models to satisfy high-level requirements related to the execution of cloud workflow processes (e.g., the security issue). We also present formalisms for describing cloud workflow models and workflow application fragmentations based on deployment models. Concretely, the formalisms we propose are based on the information control net theory [
The remainder of this paper is organized as follows. In Section 2, we detail the need for workflow application fragmentations and describe how workflow applications can be fragmented based on cloud deployment models. In Section 3, we provide some formal definitions and describe our proposed algorithm for workflow application fragmentations. In Section 4, we introduce a preliminary architecture for a cloud workflow engine supporting our proposed approach. In Section 5, a summary of related works will be presented. Finally, we conclude this paper and discuss future work in Section 6.
Over the past few years, we have witnessed an era of remarkable growth in the field of cloud computing and its applications. Although there is no doubt that cloud computing is the most attractive option for building enterprise information systems, there are still issues that must be addressed carefully. One of the key issues in cloud computing is deployment models. It is important to choose a method for deploying cloud services based on the preferences of an organization. IT policy, business characteristics, the required level of data governance, elasticity, and security should be taken into account when selecting a deployment model. Regarding cloud deployment, there are four major types of deployment models: private, community, public, and hybrid. Details of each deployment model are summarized in
Deployment type | Description | Managed by | Advantages | Challenges |
---|---|---|---|---|
Private cloud | The cloud infrastructure and services are exclusively provisioned for a single organization and managed by the organization. | Organization | High utilization of existing in-house resources; full control over security/policy concerns | Required in-house skills and staff; high risk of ownership |
Community cloud | The cloud infrastructure and services are exclusively provisioned for a specific community of organizations that have common objectives. | Organization (community member) or third-party provider | Support for large inter-organizational projects | Interoperability and compliance issues |
Public cloud | The cloud infrastructure and services are provisioned for open use by the general public. | Third-party provider | Faster deployments | Elasticity and flexibility |
Hybrid cloud | The cloud infrastructure and services are configured as a combination of two or more deployment types (private, community, or public). | Organization or third-party provider | High elasticity; customizability | Most complicated configuration; standardization, interoperability, and compliance issues |
The private cloud model is characterized by being built and provided for only a single organization. That organization has ownership of the management and operations of the cloud infrastructure and services. However, this means that most of the responsibility for the availability of services and data privacy also belongs to that organization. In terms of computing resource utility, a private cloud may be well-suited for organizations that already own data centers and a developed IT infrastructure and want to reuse existing in-house computing resources. At the other extreme, a public cloud can be accessed by any customer, regardless of their needs and concerns. Organizations using a public cloud may benefit from the offerings of cloud providers, such as scalability and reliability of services, as well as fast deployment. Because of these benefits, the public cloud model is the dominant deployment model in cloud computing. However, there is the potential for failure of service-level agreements (SLAs) because the organization does not have full control of the cloud infrastructure. A community cloud infrastructure is constructed to support a specific inter-organizational project. The community consists of member organizations that have shared concerns or objectives, and their cloud is managed by the community (single or multiple members) or a third-party organization. The core function of a community cloud is to accelerate cooperation between member organizations and the integration of resources for the project. Although there are excellent examples of community clouds, such as that of the European Organization for Nuclear Research (CERN), the challenges of interoperability and compliance still exist. Finally, a hybrid cloud is a combination of two or more deployment models (private, community, and public).
Organizations apply this model to leverage the advantages of multiple models by outsourcing low-priority activities (or high-computation tasks) to a public cloud and controlling core activities through a private cloud [
Historically, workflow management technology and systems are known to have originated from efforts in the field of office automation in the mid-80s. Technological maturation was achieved in the early 90s [
Workflow applications in business domains: BPM technology, as a successor to workflow technology, has been actively employed in many industries, including the financial, e-government, and manufacturing industries. In particular, as the SOA concept has come to play a key role in the development of enterprise information systems in recent years, most industries are leveraging these technologies for orchestrating and enacting web services (e.g., WS-BPEL). In this context, the migration issue for traditional workflow applications has arisen (i.e., the transformation from legacy systems to web services). Consequently, many enterprises have begun focusing on exploiting the potential of cloud computing for migrating workflow applications and accelerating their business process enactments. For example, traditional workflow applications in the banking industry are typically composed of numerous program functions and implemented on legacy systems. In this situation, Reference [
Workflow applications in scientific domains: Cloud infrastructures are even more crucial for scientific workflows than for the BPM domain. In fact, several scientific experiments require substantial data-intensive tasks and encompass a wide variety of workflow applications that perform scientific tasks [
In summary, the cloud workflow has proven useful in automating and managing large-scale workflows for business and scientific experiments to achieve competitive advantages by utilizing cloud computing (e.g., scalability and alleviation of concerns regarding maintenance). In this context, we argue that a proper method to fragment and deploy workflow applications in different cloud environments should be considered during the stages of designing and implementing a CWMS. The term “workflow application” refers to an invoked application, which includes program logic for the execution of one or more tasks within a workflow process. Workflow applications have historically been implemented in the form of legacy systems or monolithic programs that are not strongly integrated with a workflow management system. For the numerous workflow applications operating in heterogeneous environments, standardization and maintenance issues became increasingly difficult and led to the requirement for expensive modernization efforts. As web service-based technology has emerged, coarse-grained workflow applications have been decomposed and reconstructed for the granularity of service to fit modern business processes or other business needs. In this context, we introduce a new approach to fragment and distribute a group of workflow applications based on cloud deployment models. This approach facilitates the intentional configuration of workflow application distributions to satisfy high-level requirements for operating a CWMS (e.g., privacy concerns for enacting workflow processes).
Assuming that an organization adopts a cloud deployment model when designing and implementing a CWMS, the operational environment will depend to a large extent on the selected deployment model.
Private-model-based environment: The environment for a cloud workflow management system consists of three sublayers: the cloud environment layer (CEL), workflow process layer (WPL), and workflow management system layer (WMSL), as shown in
Community-model-based environment: The community cloud deployment model assumes that many organizations participate in a community with a common goal and have expectations for co-ownership of the cloud. Therefore, in this model, each organization independently manages its own WMSL to provide the system functionalities for performing shared workflow processes (i.e., inter-organizational processes). As you can see in
Public-model-based environment: According to the characteristics of the public model, all workflow applications are openly exposed and shared with all organizations accessing a public cloud. Organizations, such as research institutes, enterprises, and government agencies, that independently manage WMSLs and WPLs, are permitted to freely utilize workflow applications provided by a public cloud managed by a third-party provider, as shown in
Hybrid-model-based environment: The hybrid model, which includes multiple deployment models, is an appropriate model for organizations that simultaneously operate inter- and intra-organizational workflow processes. Additionally, this model is well-suited to environments in which cloud bursting [
Above, we described different cloud workflow environments based on their type of cloud deployment model. For the sake of clarity, we presented simple examples in which a single deployment model was applied and a process-wise deployment strategy was assumed. However, the ultimate operational environment for cloud workflows that we pursue in this study is geared toward multi-cloud environments with activity-wise deployments. In other words, it is more natural and practical to apply a deployment model to each activity, so that the workflow applications invoked from each activity are deployed to different cloud environments, rather than deploying all workflow applications to a specific cloud. This feature enables more sophisticated configurations for the enactments of cloud workflows and related workflow application distributions.
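The difference between process-wise and activity-wise deployment can be illustrated with a minimal Python sketch. The activity names, the `DeploymentType` enum, and both helper functions are hypothetical and introduced only for illustration. The enum lists just the three base models, on the assumption that a hybrid configuration arises from mixing them across activities.

```python
from enum import Enum


class DeploymentType(Enum):
    # Assumed base deployment types; a hybrid setup is modeled as a mix of these.
    PRIVATE = "private"
    COMMUNITY = "community"
    PUBLIC = "public"


def process_wise(activities, deployment):
    """Assign one deployment type to every activity of a process."""
    return {activity: deployment for activity in activities}


def activity_wise(activities, overrides, default):
    """Assign a deployment type per activity, falling back to a default."""
    return {activity: overrides.get(activity, default) for activity in activities}


def clouds_used(assignment):
    """Return the set of deployment types that an assignment touches."""
    return set(assignment.values())


# Hypothetical four-activity process.
activities = ["a1", "a2", "a3", "a4"]

single_cloud = process_wise(activities, DeploymentType.PRIVATE)
multi_cloud = activity_wise(
    activities,
    overrides={"a2": DeploymentType.PUBLIC, "a3": DeploymentType.COMMUNITY},
    default=DeploymentType.PRIVATE,
)

print(sorted(d.value for d in clouds_used(single_cloud)))
print(sorted(d.value for d in clouds_used(multi_cloud)))
```

In this sketch, the process-wise assignment touches a single cloud, whereas the activity-wise assignment spreads the same process over three deployment types while keeping the remaining activities on the default private cloud.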
In this section, we formally describe the cloud workflow model that represents the theoretical basis for this work. We then describe the steps for workflow application fragmentations based on cloud deployment models. Additionally, the information control net (ICN) theory, presented in [
A workflow process is described by incorporating information from a few primary aspects: control-flow, data, organization, and resources. Based on this convention, we add the cloud aspect to the ICN methodology to describe cloud workflow processes.
Cloud workflow process: Defined by a predefined set of activities (or tasks) and their precedence/succession relationships. A CWMS has functionalities for modeling, enacting, and controlling defined workflow processes in cloud environments.
Activity: An entity type that represents the basic unit of work comprising a cloud workflow process. Each activity has not only precedence relationships with other activities but also association relationships with the entity types of several aspects of the cloud workflow, e.g., transition condition, relevant data, role, workflow application, and deployment type.
Role: A logical unit of the organizational structure. It is concerned with duties, skills, and authorities required for the execution of a particular activity. In addition, each role can take responsibility for multiple activities.
Actor: A person capable of performing an activity through the associated role. Each actor can participate in multiple roles.
Relevant data: Input/output data objects used to perform an activity. They can also be embedded in transition conditions corresponding to disjunctive patterns (i.e., OR-split and XOR-split) for making decisions in a workflow process enactment.
Workflow application: A software program that is invoked during the phase of execution of an activity and automatically processes related computational tasks.
Deployment type: Refers to the cloud environment responsible for executing a particular activity. Workflow applications associated with the execution of the activity will be deployed on different cloud environments based on this specified information.
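The entity types listed above can be sketched as a small Python data model. This is not the formal ICN definition. The class names, field names, and example values are assumptions chosen for illustration, and entities such as roles and workflow applications are reduced to plain strings to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Activity:
    """Basic unit of work with its associations to the other aspects."""
    name: str
    role: Optional[str] = None                            # role responsible for the activity
    inputs: Set[str] = field(default_factory=set)         # relevant data consumed
    outputs: Set[str] = field(default_factory=set)        # relevant data produced
    applications: Set[str] = field(default_factory=set)   # invoked workflow applications
    deployment_type: Optional[str] = None                 # cloud aspect of the activity


@dataclass
class CloudWorkflowProcess:
    """Activities, their precedence relationships, and actor-role mappings."""
    name: str
    activities: Dict[str, Activity] = field(default_factory=dict)
    successors: Dict[str, Set[str]] = field(default_factory=dict)
    actor_roles: Dict[str, Set[str]] = field(default_factory=dict)

    def add_activity(self, activity):
        self.activities[activity.name] = activity
        self.successors.setdefault(activity.name, set())

    def add_flow(self, source, target):
        if source not in self.activities or target not in self.activities:
            raise KeyError("both activities must be defined before linking them")
        self.successors[source].add(target)

    def activities_of_role(self, role):
        """A role can take responsibility for multiple activities."""
        return sorted(a.name for a in self.activities.values() if a.role == role)

    def actors_for(self, activity_name):
        """Actors may perform an activity through its associated role."""
        role = self.activities[activity_name].role
        return sorted(actor for actor, roles in self.actor_roles.items() if role in roles)


# Hypothetical two-activity process.
process = CloudWorkflowProcess("order_handling")
process.add_activity(Activity("a1", role="clerk", outputs={"order"},
                              applications={"intake"}, deployment_type="private"))
process.add_activity(Activity("a2", role="analyst", inputs={"order"},
                              applications={"risk_check"}, deployment_type="public"))
process.add_flow("a1", "a2")
process.actor_roles = {"alice": {"clerk", "analyst"}, "bob": {"analyst"}}

print(process.actors_for("a2"))
```

Keeping the deployment type as an attribute of the activity, rather than of the process, mirrors the activity-wise association described in the list.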
As described above, a completely defined ICN-based cloud workflow model contains workflow entities for various aspects and the relationships between those entities. In this paper, we focus on the cloud aspect, particularly focusing on fragmenting workflow applications based on the deployment types assigned to activities. The following is a definition of a partial workflow model (PWM), which serves as the cloud-deployment-oriented portion of the cloud workflow model:
The model contains eight activities (
In order to formally describe the step of workflow application fragmentations, we define the workflow application fragment model that is a transformation result from a partial workflow model by a fragmentation algorithm.
A workflow application fragment model is a network model in which the cloud deployment type that acquires each workflow application fragment is a nodal type. The flow relationships between these deployment types are formed through the execution of the cloud workflow process in the example model. The following represents details of the proposed algorithm for the theoretical fragmentation of workflow applications in a cloud workflow model. Specifically, the algorithm transforms a PWM (Definition 1) to a workflow application fragment model (Definition 2) by linking deployment types and workflow applications based on the relationships with the corresponding activities.
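The linking step described above can be approximated by the following Python sketch. It is not the paper's algorithm. The three dictionaries are an assumed, simplified stand-in for the partial workflow model, and the function name `fragment_applications` is hypothetical. The sketch also assumes that only flows crossing from one deployment type to another are kept as edges of the fragment model.

```python
from collections import defaultdict


def fragment_applications(flows, applications, deployment):
    """Group workflow applications by deployment type and lift activity flows.

    flows:        dict mapping an activity to its successor activities
    applications: dict mapping an activity to the workflow applications it invokes
    deployment:   dict mapping an activity to its deployment type
    """
    fragments = defaultdict(set)   # deployment type -> workflow applications
    fragment_flows = set()         # (source type, target type) pairs

    # Each deployment type acquires the applications of its activities.
    for activity, dtype in deployment.items():
        fragments[dtype].update(applications.get(activity, ()))

    # Activity-level precedence induces flows between deployment types.
    for source, targets in flows.items():
        for target in targets:
            src_type, dst_type = deployment[source], deployment[target]
            if src_type != dst_type:   # assumption: drop flows inside one cloud
                fragment_flows.add((src_type, dst_type))

    return dict(fragments), fragment_flows


# Hypothetical four-activity model with an AND-split after a1.
flows = {"a1": ["a2", "a3"], "a2": ["a4"], "a3": ["a4"], "a4": []}
applications = {"a1": ["intake"], "a2": ["simulate"],
                "a3": ["share_results"], "a4": ["archive"]}
deployment = {"a1": "private", "a2": "public", "a3": "community", "a4": "private"}

fragments, fragment_flows = fragment_applications(flows, applications, deployment)
print(fragments)
print(sorted(fragment_flows))
```

Here the private fragment collects the applications of both `a1` and `a4`, and the four cross-cloud flows connect the private node with the public and community nodes in both directions.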
The fragment model generated by the fragmentation algorithm is graphically represented in
A formal representation of the generated fragment model is provided in
Thus far, we have discussed the concept of cloud deployment models and formalisms of workflow application fragmentations through these models. To put our approach into practice, we now describe the conceptual architecture of a cloud workflow engine while accounting for the execution of cloud workflow processes using application fragmentation based on deployment models. The conceptual architecture includes client components, the workflow engine, and cloud environments. An overview of this architecture is illustrated in
The architecture includes client components, such as the workflow process modeler, runtime clients, and monitoring and analysis clients, all of which provide graphical user interfaces for end users. They communicate with the workflow engine, which provides the functionality supporting cloud workflow management actions. The workflow engine is the core component of a CWMS and consists of two major parts: the modeling and deployment part, and the enactment part.
Modeling and deployment part: The components in this part are responsible for assisting in the various phases of modeling and deployment of cloud workflow processes. First, a workflow process designer performs process modeling using various information, such as model definitions, organizational information, and relevant data provided by the modeling data agent. In this step, the information about the deployment types corresponding to each activity should be specified. After the completion of the process modeling step, the designed model is submitted to the workflow engine for deployment. Using the process model parser, the model is transformed into an executable model (e.g., WS-BPEL) and the process model verifier determines if syntax errors exist within the transformed model. Finally, based on the specified deployment type information, the workflow application manager automatically deploys workflow applications on different cloud environments through communications with application service brokers. The overall procedure for process model deployment is depicted in
Enactment part: A deployed process model can be instantiated by a human worker who has the authority to create a process instance or by triggering an event with a specific condition for starting the process. The statuses of all activated process instances are managed by the process instance manager. The enactment scheduler builds concrete plans that specify execution orders for active instances with consideration for the current statuses of cloud environments. The work item handler requests the application service broker to execute a task within the active process instance and the broker determines which cloud should perform the execution based on deployed workflow applications. As an operational example,
To apply our approach to CWMSs, we presented a preliminary system architecture, largely focused on the workflow engine. According to this architecture, the step of workflow application fragmentation is performed when deploying a process model via communication between the workflow application manager and application service brokers that are responsible for a group of homogeneous cloud environments. Although the presented architecture requires additional elaboration to facilitate the implementation of a CWMS, it suggests that our deployment-based fragmentation approach is viable and can potentially aid in the flexible distribution of large-scale applications for cloud workflow processes.
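The interplay between the workflow application manager and the application service brokers might look roughly like the Python sketch below. Both classes, their methods, the cloud names, and the least-loaded placement policy are assumptions made for illustration. The architecture described above does not prescribe how a broker chooses a cloud within its group.

```python
class ApplicationServiceBroker:
    """Hypothetical broker for a group of homogeneous cloud environments."""

    def __init__(self, deployment_type, clouds):
        self.deployment_type = deployment_type
        self.clouds = list(clouds)
        self.placements = {}   # workflow application -> cloud name

    def deploy(self, application):
        # Toy policy (an assumption): pick the cloud hosting the fewest applications.
        load = {cloud: 0 for cloud in self.clouds}
        for cloud in self.placements.values():
            load[cloud] += 1
        target = min(self.clouds, key=lambda cloud: load[cloud])
        self.placements[application] = target
        return target

    def execute(self, application):
        # The broker decides which cloud runs the task from its own placements.
        cloud = self.placements[application]
        return f"{application} executed on {cloud}"


class WorkflowApplicationManager:
    """Hypothetical manager that fragments applications at deployment time."""

    def __init__(self, brokers):
        self.brokers = {broker.deployment_type: broker for broker in brokers}
        self.broker_of = {}    # workflow application -> responsible broker

    def deploy_model(self, activities):
        """activities: dict mapping a name to (deployment type, applications)."""
        for deployment_type, apps in activities.values():
            broker = self.brokers[deployment_type]
            for app in apps:
                broker.deploy(app)
                self.broker_of[app] = broker

    def run(self, application):
        # Stand-in for the work item handler asking the broker to execute a task.
        return self.broker_of[application].execute(application)


manager = WorkflowApplicationManager([
    ApplicationServiceBroker("private", ["private-dc"]),
    ApplicationServiceBroker("public", ["public-1", "public-2"]),
])
manager.deploy_model({
    "a1": ("private", ["intake"]),
    "a2": ("public", ["simulate", "render"]),
})

print(manager.run("intake"))
print(manager.run("render"))
```

The manager only needs to know the deployment type of each activity. Which concrete cloud hosts an application stays inside the broker, so a placement policy can be changed without touching the process model.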
The convergence of workflow management systems with distributed computing paradigms and their potential synergies have long been discussed as a major research topic. Beyond past paradigms, such as cluster computing and grid computing [
Regarding cloud workflow management, its fundamental characteristics were investigated in [
A few general issues associated with cloud computing have also been examined in conjunction with cloud workflow management. To achieve cost-effective computation, Reference [
The main topic that we concentrated on in this paper was the fragmentation and distribution of workflow applications for cloud workflow processes. References [
Regarding the fragmentation issue, Reference [
In contrast to other studies on the fragmentation issue, our work sets a scope for fragmentation at the activity level so that a group of workflow applications will be partitioned and distributed based on the deployment types attached to each activity. Therefore, the deployment-type-based approach we are pursuing will enable us to configure the sophisticated deployment of workflow applications to satisfy high-level requirements at the stages of designing and implementing a CWMS.
In CWMSs, it is crucial to find an appropriate method of distributing the executions of large-scale workflow processes in order to accomplish the objectives (e.g., scalability and performance of process executions) that are expected when applying cloud computing to workflow management.
We have presented an approach to workflow application fragmentations based on cloud deployment models. To conceptualize our approach, a series of deployment-model-based operational environments for cloud workflow management was described. For the sake of clarity, we also presented formal descriptions for the workflow application fragmentation of an ICN-based cloud workflow model by specifying activity-wise deployment type information. Through our fragmentation algorithm and the resulting fragment model, we formally verified our approach.
In summary, the main advantages of our approach are as follows: First, the capability to satisfy high-level requirements is the major merit of our approach. Activity-wise deployment allows us to distribute workflow applications with an adequate reflection of desired high-level requirements (e.g., security issues and aligning business policy with IT policy). Second, we can achieve higher degrees of scalability and flexibility in the execution of cloud workflow processes by applying different deployment models. For example, cloud bursting, in which a CWMS outsources computation-intensive tasks within a private cloud to a public cloud, can be enabled by applying the hybrid deployment model. We believe that our approach contributes to the foundation of the design and implementation of CWMSs for large-scale workflow processes executed on multiple clouds. In the future, we will concretize our fragmentation concept to embody a CWMS based on deployment-model-based fragmentation.
The authors gratefully acknowledge the support of the Contents Convergence Software Research Institute and the National Research Foundation of Korea.