Cyber-Physical Systems (CPS) combine interacting computation, networking, and physical processes. The integrative environment of CPS enables smart systems to be aware of the surrounding physical world. Smart systems, such as smart health care systems, smart homes, smart transportation, and smart cities, are made up of complex and dynamic CPS. Component integration should follow a divide-and-conquer approach, so that multiple interacting components reduce the development complexity of CPS. Because reusability enhances efficiency and consistency in CPS, encapsulation of component functionalities and a well-designed user interface are vital for the end user's Quality of Experience (QoE). Conversely, incorrect interaction of interfaces in a cyber-physical system causes system failures. Interface failures usually occur due to false and ambiguous requirements analysis and specification. To resolve this issue, semantic analysis of the different stakeholders’ viewpoints is required during requirement specification and component analysis. This work proposes a framework to improve the CPS component integration process, from requirement specification to prioritization of components for configurable CPS. For semantic analysis and for assessing the reusability of specifications, the framework uses text mining and case-based reasoning techniques. The framework has been tested experimentally, and the results show a significant reduction in ambiguity, redundancy, and irrelevancy, as well as increased accuracy of interface interactions and component selection and higher user satisfaction.
Software development is a complex activity that is human and knowledge-intensive [
In the software engineering literature, the component-based software development (CBSD) approach has attracted the attention of industry and researchers over the last decade. It focuses on the composition of components as elementary blocks for product development [
Complexity, redundancy, and conflicts in the requirements engineering phases of CPS components can propagate errors into all subsequent phases and increase development time and cost. The requirements of any system are rooted in stakeholders’ viewpoints, business needs, and the operating environments of end users. Requirements management is an important phase of the software development process, because system requirements are defined, documented, and maintained in this phase using natural language [
Natural language raises issues of redundancy, conflict, and ambiguity, which may result in system failure at an early stage. These failures may trigger server downtime at the time of configuration in CPS. Consequently, the requirement management process becomes more complicated during component specification and prioritization activities. In CBSD, proper and correct integration of components plays a significant role, whereas neglected components, improper integration, and wrong interaction behavior of components, particularly in complex and configurable systems, result in system failure [
CPS specification (CPSS) requires component configuration to implement functionality changes. After configuration, interactions between reused and new components become very difficult because of term mismatches, as components are developed by various third parties aiming for different environments and expectations. CPS prioritization (CPSP) is the process of managing relative dependencies amongst components of CPS behaviors to cope with different functionality configurations within limited resources in complex projects. CPSP plays an important role in requirement management activities, particularly for critical tasks like requirements analysis and release [
Several approaches can assist with CPSS and CPSP based on stakeholder needs (cost, time, nature of the project, etc.). Most of these techniques are complex and may increase conflicts and redundancy by adopting different processes for CPSS and CPSP. CPS development focuses on integrating accurate and complete components to support the adoption of existing components' interfaces. Consequently, existing techniques fail during the component integration phase due to improper CPSS and CPSP activities. In CPS, the involvement of multiple users with diverse perspectives, who weigh components according to their own needs, leads to misinterpretation and missing semantic information across stakeholders’ viewpoints, requiring more effort during CPSS and CPSP. Moreover, selecting desirable components according to stakeholders’ needs requires more human interaction and effort, which can result in system crashes and resource shortages.
Therefore, there is a need for suitable CPSS and CPSP processes in configurable CPS-based software, where many interactions take place amongst components, to achieve higher satisfaction and reliability. Text mining (TM) could be useful for semantic, conflict, and redundancy analysis during the specification stage. The technique of extracting interesting and non-trivial patterns or information from unstructured text documents is known as text data mining or knowledge discovery from textual databases [
Our main contributions in this paper are as follows:
Firstly, we present a semantic-based CPS specification and prioritization framework using text mining and case-based reasoning (CBR), which accounts for diverse users’ viewpoints, manages reusability, and limits user involvement. Secondly, text mining is used to resolve ambiguous and conflicting requirements by semantically extracting diverse stakeholder viewpoints during configurable CPS specification. After the requirements are extracted semantically, we prioritize CPS components using two criteria: current user priority and previous user ranking. CBR is used to retrieve the rankings of similar, previously used components with less user involvement, reducing stakeholders’ conflicts. The proposed framework thereby addresses the drawbacks of configurable CPSS and CPSP processes. Thirdly, the framework is evaluated experimentally, and analysis of the results shows that it reduces ambiguity and redundancy, achieves a higher satisfaction level when handling multiple stakeholder viewpoints semantically, and identifies reused requirements during CPS software development. Finally, the study offers a guide, a baseline, and empirical evidence for future research in the domain of continuous configuration management in CPS.
The rest of the paper is structured as follows. Section 2 reviews related work to illustrate current issues. Section 3 presents the proposed framework for resolving the identified problems. Section 4 presents the findings and analyzes the proposed framework in light of the results. Section 5 outlines the conclusion and future work.
Recently, CBSD has come to be considered a more generalized approach for CPS software development. To ensure CPS quality, semantic-based specification and prioritization of configurable components are necessary. High-quality CPS requires efficient and effective development within the CBSD paradigm. To ensure the reliability of component requirements, most existing techniques focus on the post-integration phase of components, and few studies discuss the specification and prioritization of components in the context of CPS development. The authors in [
The development process of the CPS is complex, and handling these complexities during the requirement engineering phase is a critical task. The [
To realize high-quality CPSs, it is also important to consider technological and service features during the development process. Since such systems are complex and redundant, handling requirements for dynamic configuration of product and service components in CPSS during requirements engineering (RE) is a significant issue. The author of [
In the literature, few prioritization techniques satisfy specific quality criteria such as efficiency, scalability, and ease of use. The [
Text mining methods are employed to identify term mismatches and perform semantic analysis, with implementations based on Latent Semantic Indexing/Analysis (LSI/LSA) and Latent Dirichlet Allocation (LDA). For the independent review and audit of CPSS and CPSP requirements, the text mining approach was used to reduce quality assurance effort [
Based on the existing literature, we conclude that integrating CPS components into the CBSD process is a difficult and error-prone task. This is due to the lack of semantics and the term mismatch problems that result from the diverse views of multiple stakeholders throughout the definition of component specifications. This affects all phases of the CPS development process, especially component prioritization activities. As a result, a change increases uncertainty, human interaction, inconsistency, and ambiguity in configurable CPSs. Therefore, this paper proposes a framework for improving configurable component specification and prioritization activities, using text mining and CBR to address semantic and term mismatch, component ranking, and ranking prediction for similar cases across diverse stakeholders.
The configurable CPS requirement management process fails due to conflict, redundancy, and irrelevancy in requirements specification and prioritization which negatively impact other phases of CPS development. Therefore, this section proposes the RMCPS framework for CPS components requirements specification and prioritization using text mining and CBR techniques. The RMCPS framework provides comprehensive steps for configurable CPSS and CPSP for developing CPS based on semantic analysis, reusability of requirements and priority identification, and conflict removal for completeness.
The RMCPS framework accounts for diverse stakeholders’ perspectives, reduced human interaction, reusability of requirements, and the rankings of similar CPS components when ranking current CPS components, and it predicts the missing rankings of selected components to resolve issues after configuration in complex configurable CPSs. The framework consists of three main steps, i.e., requirement elicitation and analysis (REA), reusability manager (RM), and prioritization, as shown in
In the REA phase, requirements are gathered from collaborating stakeholders using a web-based application. The collected requirements are then analyzed as business case requirements (BCR), system case requirements (SCR), and conflict case requirements (CCR) for configurable CPS.
We categorize the BCR of CPS based on the objectives, scope, benefits, performance, risks, roles, cost, resources, and rationales of the system. This helps to derive missing and incomplete requirements that were not collected during collaboration.
In the SCR list, requirements relevant to response time, the volume of data, security, performance, usability, etc. are identified. These requirements of CPS may be conflicting and thus need to be managed carefully.
In CCR we focus the analysis on commonality and conflicts in requirements, thus leading to requirements merging and removal.
In the RM phase, requirements are structured, and semantically reusable requirements are identified for specification and prioritization. RM comprises two processes, i.e., semantic analysis and query matching.
In the SM process, we extract requirements from artifacts, along with the priorities of CPS stakeholders, using the RStudio tool for text mining (TM); the tool automatically extracts terms semantically within and among all documents. TM is used to automatically analyze semantic information in the text in the form of terms, based on concepts and their relationships. The K-nearest TM method is used for term extraction based on term frequencies. The following steps are used for TM [
1. Extract information from the terms of the CPS component specification.
2. Eliminate stop words, prepositions, repeated words, punctuation marks, etc.
3. Convert plurals to singular, remove “ing” from words, and group words of similar context to find the terms of CPS features.
4. Extract CPS functionalities semantically to avoid inconsistency and incompleteness.
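The preprocessing steps listed for TM can be sketched in Python as follows. This is a minimal illustration only, not the RStudio pipeline used in the study: the stop-word list, the naive suffix-stripping rules, the sample specification text, and the function names `normalize` and `extract_terms` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for illustration).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "when", "must", "shall"}


def normalize(word: str) -> str:
    """Naive normalization: strip a trailing 'ing' or a plural 's'."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def extract_terms(text: str) -> Counter:
    """Tokenize, drop stop words and punctuation, normalize, and count term frequencies."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(normalize(t) for t in tokens if t not in STOP_WORDS)


# Hypothetical specification sentence (not taken from the case studies).
spec = "The doors must lock when the alarm is activated. Locking the doors blinks the lights."
terms = extract_terms(spec)
print(terms.most_common(3))
```

A production pipeline would normally rely on a proper stemmer or lemmatizer and a curated stop-word list; the hand-written rules here only make the individual steps visible.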
In the QM phase, all terms are indexed according to their frequency, and each CPS component is searched using CBR for reusable features and their priorities among relevant and similar requirements, to improve the prioritization process and reduce stakeholder involvement. CBR is attractive because it offers continuity and improves transparency as experience is gained. CBR reuses a previous, similar solution for requirement ranking to rank new CPS features and stores the ranking in the central database for future use [
1. Retrieve components with similar functionalities: previous components with similar functionalities are matched, and their rankings are saved in a repository together with the current functionalities using expert knowledge.
2. Components adoption: similar components that match the current components are selected based on their previous ranking information.
3. Reuse ranking: the stakeholders’ previous rankings of similar components are reused during component interaction in the integration process.
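The retrieve, adopt, and reuse steps of CBR can be illustrated with a small sketch. It assumes a case base that maps previously ranked component functionalities to their rankings and uses Jaccard similarity over term sets as the matching measure. The source does not specify a similarity measure, so this choice, the contents of `CASE_BASE`, the threshold, and the function names are all hypothetical.

```python
from typing import Optional

# Hypothetical case base of previously ranked functionalities (assumption).
CASE_BASE = {
    "lock door": 4.0,
    "unlock door": 4.0,
    "activate alarm": 3.5,
}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two term sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_ranking(query: str, threshold: float = 0.5) -> Optional[float]:
    """Return the ranking of the most similar stored case, or None if nothing is similar enough."""
    q = set(query.lower().split())
    best_rank, best_sim = None, 0.0
    for case, rank in CASE_BASE.items():
        sim = jaccard(q, set(case.split()))
        if sim > best_sim:
            best_rank, best_sim = rank, sim
    return best_rank if best_sim >= threshold else None


def retain(query: str, rank: float) -> None:
    """Store a newly ranked functionality so that later queries can reuse it."""
    CASE_BASE[query.lower()] = rank


print(retrieve_ranking("door lock"))    # matches the stored case "lock door"
print(retrieve_ranking("blink light"))  # no similar case stored yet
```

Because matching is done on term sets, word order does not matter, which is one simple way to treat entries such as "Blink light" and "Light blink" as the same functionality.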
This step identifies the missing rankings of current CPS components in order to reduce human interaction and redundancy. During the elicitation process, some stakeholders are not directly involved and only use the requirements after the development process is complete.
In this phase, current and previous priorities are merged to fill in the missing priorities of some CPS components. This results in a new priority for the semantically analyzed features of CPS components. Because it requires less stakeholder involvement, it reduces incompleteness, inconsistency, conflicts, and ambiguity in feature priorities. After this step, a priority list of stakeholder component interactions is established and then sorted so that higher-ranked components come first. Implementing the higher-priority components first facilitates the integration of the components that stakeholders desire.
In the next section, we present the results of an empirical study with quantitative analysis based on these factors. The study indicates that RMCPS enhances requirements management activities by using CBR and text mining techniques.
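The merging of current and previous priorities can be sketched as follows. The text does not state the merge rule, so taking the larger of the two rankings and falling back to the previous ranking when the current one is missing is an assumption. It was chosen because it agrees with the "New ranking" values in the Car Security Alarm example. The function name `merge_priority` is hypothetical.

```python
from typing import Optional


def merge_priority(current: Optional[float], previous: Optional[float]) -> Optional[float]:
    """Combine current and previous rankings; a missing value falls back to the other one."""
    known = [r for r in (current, previous) if r is not None]
    return max(known) if known else None


# (current, previous) rankings from the Car Security Alarm example.
rankings = {
    "R-1": (5, 4),
    "R-2": (None, 4),  # no current ranking was supplied
    "R-3": (0, 3),
    "R-4": (1, 3.5),
    "R-5": (4, 4.5),
}

merged = {rid: merge_priority(cur, prev) for rid, (cur, prev) in rankings.items()}
# Plain descending sort; the source does not describe its exact sorting or tie-breaking procedure.
ordered = sorted(merged, key=merged.get, reverse=True)
print(merged)
print(ordered)
```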
In this section, we describe the results of the experiment performed to validate the effectiveness of RMCPS. We selected two projects of a real-world software technology company as case studies, i.e., a Car Security Alarm and a Patient Record System. The company had used different methods for specification and prioritization to achieve higher user satisfaction and productivity, which required extensive human interaction. The participants' consent to the evaluation of the proposed framework was obtained by email after approval from the ethics committee of the selected organization, and we agreed to follow the organization's privacy policy on sharing information about the case studies and participants. The evaluation was designed according to the participants’ knowledge and experience relevant to the selected case studies. The participants agreed to implement RMCPS to investigate user satisfaction and product quality with proper CPSS and CPSP activities. We selected 12 participants and divided them equally into two groups, i.e., the Experiment Treatment Group (ETG) and the Control Treatment Group (CTG). The ETG developed both projects using RMCPS, and the CTG adopted a traditional method for the development of both projects.
The participants included requirement analyst (RA), project manager (PM), Stakeholder (Sr), team leaders (TL), developers (Ds), and quality analyst (QA). After completing the project, we analyzed the progress based on some parameters which were identified from the existing literature for improving CPS component-based CPSS and CPSP i.e., easy to adopt (EA), component identification and retrieval (CIR), term mismatch resolves (TMR), semantic analysis (SA), increase productivity (IP), formal specification (FS), reduced human interaction (RHI), components prioritization process (CPP), prioritize desirable components (PDC), enhance components integration (ECI), proactive to changes (PTC), remove requirements conflict (RRC), remove requirements redundancy (RRR), increase process accuracy (IPC), increased completeness of requirements (ICR) and increased user satisfaction (IUS). Additionally, in the study we addressed the following research questions (RQs):
RQ 1: What is the effect of semantic-based requirement specification and prioritization on the outcome of the components integration process?
RQ 2: Does the implementation of RMCPS produce better results than other relevant methodologies?
RQ 3: Can RMCPS effectively improve the accuracy of the component integration process?
To answer the RQs, we conducted an experiment to extract parameter-based satisfaction levels. The experiment started by gathering the requirements of the CPS-based data sets from the participants and mapping them to the stated requirements. These requirements were then divided into BCR, SCR, and CCR and documented. We then used the documents, with their complete constraints and stakeholder viewpoints, for semantic analysis using the RStudio tool. For example, in the case of the Car Security Alarm, some of the terms extracted by the text mining process are listed in
S. No. | Requirements ID | Components functionalities |
---|---|---|
1 | R-1 | Door lock |
2 | R-2 | Door unlock |
3 | R-3 | Blink light |
4 | R-4 | Activate alarm |
5 | R-5 | Light blink |
After extracting components functionalities, we extracted their current and previous ranking, as listed in
S. No. | Requirements ID | Current ranking | Previous ranking | New ranking | Sorting |
---|---|---|---|---|---|
1 | R-1 | 5 | 4 | 5 | R-1 |
2 | R-2 | – | 4 | 4 | R-5 |
3 | R-3 | 0 | 3 | 3 | R-4 |
4 | R-4 | 1 | 3.5 | 3.5 | R-2 |
5 | R-5 | 4 | 4.5 | 4.5 | R-3 |
S. No. | RMCPS sorting | New ranking | Traditional method sorting | New ranking |
---|---|---|---|---|
1 | R-1 | 5 | R-2 | 5 |
2 | R-5 | 4 | R-3 | 4 |
3 | R-4 | 3 | R-1 | 3 |
4 | R-2 | 3.5 | R-5 | 2 |
5 | R-3 | 4.5 | R-4 | 1 |
Parameters | HS (ETG) | HS (CTG) | S (ETG) | S (CTG) | N (ETG) | N (CTG) | DS (ETG) | DS (CTG) | HD (ETG) | HD (CTG) |
---|---|---|---|---|---|---|---|---|---|---|
EA | 44 | 1 | 53 | 40 | 2 | 43 | 2 | 44 | 0 | 56 |
TMR | 25 | 1 | 66 | 26 | 7 | 27 | 2 | 25 | 0 | 67 |
SA | 36 | 1 | 56 | 38 | 6 | 37 | 2 | 36 | 0 | 58 |
FS | 43 | 2 | 54 | 42 | 2 | 42 | 1 | 43 | 0 | 55 |
RHI | 44 | 2 | 53 | 45 | 2 | 41 | 1 | 44 | 0 | 52 |
CIR | 54 | 2 | 43 | 39 | 2 | 51 | 1 | 54 | 0 | 43 |
CPP | 45 | 2 | 53 | 44 | 1 | 40 | 1 | 45 | 0 | 51 |
PDC | 43 | 2 | 54 | 45 | 2 | 44 | 1 | 43 | 0 | 52 |
ECI | 44 | 2 | 53 | 43 | 2 | 41 | 1 | 41 | 0 | 54 |
PTC | 33 | 3 | 59 | 34 | 5 | 32 | 3 | 30 | 0 | 57 |
IP | 34 | 3 | 58 | 37 | 5 | 29 | 3 | 33 | 0 | 59 |
RRC | 54 | 2 | 43 | 39 | 2 | 51 | 1 | 54 | 0 | 43 |
RRR | 45 | 2 | 53 | 44 | 1 | 40 | 1 | 45 | 0 | 51 |
ICP | 43 | 2 | 54 | 45 | 2 | 44 | 1 | 43 | 0 | 52 |
ICR | 44 | 2 | 53 | 45 | 2 | 41 | 1 | 44 | 0 | 52 |
IUS | 30 | 2 | 60 | 33 | 8 | 32 | 2 | 31 | 0 | 61 |
The rating scale used for evaluation comprises Highly satisfied (HS), Satisfied (S), Neutral (N), Dissatisfied (DS), and Highly dissatisfied (HD). The members of the ETG, who applied RMCPS, were more highly satisfied than the CTG members, who did not use RMCPS. Overall, the results show that using RMCPS yields better outcomes for customer needs and quality than working without it.
To answer RQ2, we statistically compared the results with other techniques, i.e., the Analytic Hierarchy Process (AHP) and the clustering method. AHP prioritizes requirements in a pair-wise manner based on importance, penalty, cost, time, and risk, whereas the clustering method divides a given set of data into several clusters to determine the relative closeness between objects. For the comparison, the CTG adopted AHP and the clustering method for the requirement management process of integrated components to validate their functionalities.
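To make the pair-wise nature of AHP concrete, the sketch below derives priority weights from a pairwise comparison matrix using the row geometric-mean approximation of the principal eigenvector. The matrix values are hypothetical and the function name `ahp_weights` is an assumption. The source does not describe how the CTG computed its AHP weights.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical pairwise comparison of three requirements (assumption):
# entry [i][j] states how much more important requirement i is than requirement j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
print(weights)
```

For n requirements, AHP needs n(n-1)/2 pairwise judgments per criterion, so the elicitation effort grows quadratically with the number of requirements.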
Contributors | RMCPS | AHP | Clustering |
---|---|---|---|
RA | 87 | 43 | 43 |
PM | 84 | 51 | 53 |
Sr | 65 | 52 | 54 |
TL | 69 | 54 | 53 |
Ds | 70 | 57 | 59 |
QA | 72 | 59 | 58 |
To validate the questionnaire, we conducted a reliability test using SPSS version 23, which automatically calculates the reliability value of the data for statistical analysis, to highlight the significant differences between the existing methods and our proposed RMCPS framework. For the statistical analysis, we formulated null hypotheses (H0) to check the reliability of the collected data and the significance of RMCPS. These hypotheses are as follows:
H01: The TMR has no change using RMCPS, AHP, and Clustering methods.
H02: The CIR has no change using RMCPS, AHP, and Clustering methods.
H03: The RHI has no change using RMCPS, AHP, and Clustering methods.
H04: The CPP has no change using RMCPS, AHP, and Clustering methods.
H05: The PDC has no change using RMCPS, AHP, and Clustering methods.
H06: The ECI has no change using RMCPS, AHP, and Clustering methods.
H07: The PTC has no change using RMCPS, AHP, and Clustering methods.
H08: The RRR has no change using RMCPS, AHP, and Clustering methods.
H09: The RRC has no change using RMCPS, AHP, and Clustering methods.
The statistical results are shown in
Cronbach's alpha | Cronbach's alpha based on standardized items | No. of items |
---|---|---|
0.911 | 1.00 | 12 |
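For reference, Cronbach's alpha for k questionnaire items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the respondents' total scores. The sketch below computes it with the Python standard library. The response data are hypothetical and do not come from the study's questionnaire, which was analyzed in SPSS. The function name `cronbach_alpha` is an assumption.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per questionnaire item)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical responses: 3 items answered by 4 respondents (assumption).
responses = [
    [4, 3, 5, 2],
    [4, 2, 5, 3],
    [5, 3, 4, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which the questionnaire is called reliable.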
To understand the implications and importance of the approaches, i.e., RMCPS, AHP, and Clustering, we performed a paired-sample test on each of them. The results show different means across the groups: 0.72400 for RMCPS, compared with 0.36236 and 0.47512 for AHP and Clustering, respectively. We interpret this to mean that RMCPS values disperse less from their mean and are more reliable than those of AHP and Clustering.
All tests have different t and mean values, which suggests that the experiment's performance is reliable and unbiased. As the results show, the mean value with RMCPS is higher than without RMCPS in both datasets, i.e., (0.728 and 0.744) versus (0.356 and 0.370), respectively. Therefore, our proposed framework improves the process of specification and prioritization of configurable CBS.
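A paired-sample t statistic can be computed as sketched below. Applying it to the per-contributor RMCPS and AHP scores from the comparison table is an assumption made for illustration, because the source does not state which observations were paired in SPSS. The sketch does not reproduce the reported means, and the function name `paired_t` is hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Per-contributor scores from the comparison table (RA, PM, Sr, TL, Ds, QA).
rmcps = [87, 84, 65, 69, 70, 72]
ahp = [43, 51, 52, 54, 57, 59]
t_stat, dof = paired_t(rmcps, ahp)
print(t_stat, dof)
```

The statistic is then compared against the t distribution with the returned degrees of freedom to obtain a p-value.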
Therefore, for RQ3 we used F-Measures and accuracy metrics [
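As a reminder of how these metrics are defined, the sketch below computes precision, recall, F-measure (F1), and accuracy from confusion-matrix counts. The counts are hypothetical and are not results from the study. The function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts (assumption): 8 correct component selections, 2 wrong,
# 2 missed, and 8 correctly rejected.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```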
The drawbacks of these case studies are that they are written in different programming languages, have different degrees of scalability, and have fewer expert participants available for text mining techniques. Consequently, in an empirical evaluation, many threats emerge that could cast doubt on the findings’ theoretical rationality. Therefore, it is essential to repeat the study to accept or refute decisions. Internal validity (IV), external validity (EV), construct validity (CV), and reliability validity (RV) are the four main threats [
IV is concerned with considerations relating to the organization of requirements. To counteract this threat, mitigation measures must be implemented to avoid the use of disparate measures for specification and prioritization. According to our findings, RMCPS strengthens the CPSS and CPSP processes. EV refers to how far the results obtained on the evaluated examples generalize to actual projects. Replicating the measures of the proposed framework (PF) in many cases increases the validity of the findings. CV considers the relationship between the different concepts and reflections. This necessitates applying different criteria to determine the validity of various practices, such as semantic analysis and prioritization in PF, to estimate output against other techniques. RV refers to the relationship between behavior and effect. This threat can be mitigated by performing a detailed review of the different decisions used in PF authentication. Data were collected by all the authors, and countermeasures were taken in the review of the findings. To avoid TV threats, we used an experiment to assess the learning effect, which may have affected the findings, as well as a qualitative study to minimize biases.
Software development is a complex activity, and with advancements in technological infrastructures, innovative practices need to be designed. With the recent popularity of CPSs, there is a need to optimize software engineering as well. To improve the requirement specification and prioritization processes of highly configurable components, we proposed the RMCPS framework. The framework reduces the complexity of reusing large sets of components in interface interaction and employs semantic-based analysis to satisfy multi-user viewpoints for CPS. We used text mining for specification and CBR to prioritize components, enhancing the functionalities of product components with limited resources and high agility to reduce development time. To evaluate the effectiveness of the proposed framework, we conducted an experiment, and the results showed a large mean difference (>0.5) and higher satisfaction (<0.5) for RMCPS compared with traditional approaches. Thus, the proposed framework enhances the CPSS and CPSP activities and provides a roadmap for researchers and industry practitioners in the domain of CPS specification and prioritization. As future work, we intend to extend the proposed framework to a globally distributed environment to resolve specification and prioritization issues. Furthermore, we intend to address the issue of quality analysis after continuous modification in CBSD in the cloud computing environment.