Novel Ransomware Hiding Model Using HEVC Steganography Approach

Ransomware is considered one of the most threatening cyberattacks. Existing solutions have focused mainly on discriminating ransomware by analyzing the apps themselves, but they have overlooked possible ways of hiding ransomware apps so that they are difficult to detect and then analyze. Therefore, this paper proposes a novel ransomware hiding model that uses a block-based High-Efficiency Video Coding (HEVC) steganography approach. The main idea of the proposed steganography approach is to divide the secret ransomware data and the cover HEVC frames into blocks. After that, a Least Significant Bit (LSB) based Hamming Distance (HD) calculation is performed between the divided blocks of the secret data and those of the cover frames. Finally, the secret data bits are hidden in the marked bits of the cover HEVC frame blocks based on the calculated HD value. The main advantage of the suggested steganography approach is its minor impact on the cover HEVC frames after embedding the ransomware: the histogram attributes of the cover video frame are preserved with high imperceptibility. This is due to the use of an adaptive steganography cost function during the embedding process. The proposed ransomware hiding approach was extensively examined using subjective and objective tests, applying different HEVC streams with diverse resolutions and different secret ransomware apps of various sizes. The obtained results demonstrate the efficiency of the proposed steganography approach, which achieves high capacity and a successful embedding process while ensuring the undetectability of the hidden ransomware within the video frames. For example, in terms of embedding quality, the proposed model achieved a high peak signal-to-noise ratio that reached 59.3 dB and a low mean-square error of 0.07 for the examined HEVC streams. Also, out of 65 antivirus engines, none could detect the embedded ransomware app.


Introduction
One of the main challenges facing the digital transformation of almost all aspects of our lives is cybersecurity attacks. Such attacks are launched in many different ways. A common source of attacks is malicious software (malware) that harms users' devices and data. Malware threats are classified into several types, such as Trojans, spyware, and adware [1]. Among the most threatening cyberattacks is ransomware, a form of malware that blocks access to the victim's device or data until a ransom is paid, and which has caused rapidly growing monetary losses for individuals, businesses, and governments [2]. Generally, ransomware is categorized into two main types: crypto-ransomware and locker-ransomware. Crypto-ransomware encrypts the user's sensitive information and requests payment to retrieve the decrypted data. Locker-ransomware, on the other hand, blocks interaction with the victim's device by displaying a lock screen window, which is only removed after a ransom is paid [3]. Moreover, the recent success of ransomware has led to the appearance of new families [4].
Many research solutions have been proposed to detect ransomware attacks [5][6][7]. These solutions have utilized permissions [5], API package calls [7], or both [8] to apply static or dynamic analysis to applications, whether benign or ransomware. Additionally, they have applied machine learning algorithms to build effective ransomware detection systems [6,9]. However, the current ransomware detection solutions assume that the application is available for analysis. They did not investigate the possibility of hiding the ransomware so that static or dynamic analysis becomes difficult to apply.
In the context of malware in general, developers have attempted to create well-established techniques to bypass detection systems. One such technique hides malicious components or activity using steganography, concealing the presence of the active malware application and its communication with the attacker [10]. The existing steganography techniques can be categorized, according to how the hidden communication is implemented, into three main groups [11]: (a) techniques that hide malware by mimicking benign software, (b) techniques that inject one or more components into the network traffic, and (c) techniques that hide the malware or part of its components in a digital media file. Driven by the vigorous expansion of multimedia, video steganography is gradually gaining momentum. Furthermore, research interest in utilizing video streams is increasing due to their low quality loss and high embedding capacity, specifically with the high-efficiency video coding (HEVC) standard [12,13], which provides a high bit-rate reduction.
The significant spread of ransomware and the possible advances in its anti-detection techniques create an urgent need for further investigation in this field. Furthermore, applying steganography algorithms to embed ransomware applications increases the risk that ransomware poses to individuals and businesses, since antivirus software might fail to detect the hidden ransomware. Accordingly, the major contributions of this paper can be summarized as follows:
• Deep investigation of the literature to check whether there have been any attempts to hide ransomware applications.
• Conduct a comparative analysis among existing techniques that utilize steganography to hide malicious data.
• Propose an efficient, novel approach to hide complete ransomware using block-based HEVC steganography.
• Apply comprehensive subjective and objective tests to evaluate the proposed approach.
• Obtain high similarity between the original video and the corresponding video after embedding the ransomware, as part of the subjective test results.
• Achieve high performance in terms of the 16 metrics used to assess the quality of the video after embedding the ransomware, as part of the objective test results.
• Bypass 65 well-known antivirus engines with the video that carries the embedded ransomware, as part of the security tests.
The rest of the paper is organized as follows. Section 2 provides a comparison among previous suggested works on steganography. Section 3 presents the proposed HEVC steganography-based ransomware hiding approach. Section 4 presents the approach evaluation and results' discussions. Finally, Section 5 concludes the paper and suggests possible future work.

Literature Review
This section highlights different steganography techniques presented in the literature [14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. Tab. 1 presents a comparison among several proposed steganography schemes in terms of the proposed solution's main goal, the implemented steganography technique, the type of both cover media object and the secret message, and the used evaluation metrics. Sahu et al. [15] proposed a dual-layer steganography system by applying a reversible information hiding (RIH) technique utilizing the least significant bit (LSB) in the hiding process. In the first layer, each pixel of the secret image is hidden within two bits of the cover data by implementing the LSB matching algorithm resulting in a pair of intermediate pixels.
Subsequently, during the second layer of embedding, this pair was used to hide four bits of the secret information. According to the conducted evaluation experiments, applying reversible information hiding in dual-layer resulted in high efficiency in information hiding. Hindi et al. [16] also used an image to hide a secret message. However, in their proposed work, they utilized two keys of eight decimal digits in the hiding/extraction process, aiming to enhance the level of security.
Moreover, in [17], a secret data partition based on the three RGB (Red, Green, Blue) channels was proposed to adaptively allocate the capacity of the hidden message among the RGB channels, hiding an image inside an image without affecting the performance of the hiding process. Beyond using a single image as a cover object, Liao et al. [18] developed a multiple-image steganographic system that utilizes the features of the image texture. To distribute the secret data across multiple images, they implemented an adaptive payload distribution technique. Furthermore, two payload division methods were proposed: distortion distribution (ES-DD) and image texture complexity (ES-ITC). The experimental results showed that the proposed scheme provides enhanced security performance.
The advances in video coding applications have raised interest in video steganography [19][20][21][22][23]. In [19], the authors concealed a video in another video by employing the inter-frame references of the cover video. While creating the steganographic video, a novel technique for modeling the temporal residual was implemented to fully benefit from the sparse characteristic of the differences between the inter frames. The authors in [20] implemented video steganography by utilizing the intra-prediction mode (IPM) feature of HEVC for cover selection. The stego video stream combined the prediction unit of HEVC and the coding unit to implement the cover selection process. Another application of HEVC steganography was proposed in [21], in which three intra-frame prediction modes were combined to enhance the visual quality of the carrier video. Zhang et al. [22] also used the prediction units (PUs) of HEVC to implement video steganography. However, to overcome the capacity limitation of the PU, they modified the exploiting modification direction technique so that two prediction units were combined, thereby enlarging the PU capacity. An additional solution that focuses on obtaining a high-resolution HEVC stego video stream was proposed in [23]. In order to conceal information without affecting the video quality, the authors modified the bits of the luminance intra-blocks.
Steganography can also be used to hide malicious software and increase its undetectability. Network steganography plays a significant role in malware information hiding, in which one or more components of the malicious software are embedded in the network traffic [24]. In [25], a hidden communication system was implemented utilizing the StegBlocks technique to perform text communication between the attacker and the active malicious software. However, the attacker's output text file is restricted to 23 kB. Another network steganography application was proposed in [26], which implemented a masking technique to conceal the identity of the malware application. In the proposed scheme, the tunnel generates fake traffic that simulates normal network traffic and encapsulates the actual malicious traffic. In addition to network steganography, digital media steganography has been used to embed malware by altering the structure of the carrier media file. In [27], the authors developed malware using the Metasploit framework and then embedded it in an image. Subsequently, they performed detection analysis using Open Source Intelligence Tools (OSIT) such as VirusTotal. Although the results showed an improvement in hiding the malware, some virus scanners still detected it. Stergiopoulos et al. [28] used a different digital media carrier to embed malware: they injected malware apps via audio frequencies. However, the proposed system needs to meet certain conditions, such as high speaker volume and a low-noise environment, in order to extract the injected malware.
In light of the above discussion, few works have focused on utilizing and investigating steganography techniques to hide malware. Furthermore, no research was found that discusses the ability to embed ransomware applications in digital media files. In this work, a novel system is proposed that utilizes HEVC videos as cover media to conceal ransomware applications with high efficiency.

Proposed HEVC Steganography-Based Ransomware Hiding Model
This section introduces and discusses the proposed block-based HEVC steganography approach for hiding ransomware applications. This ransomware hiding approach builds on the image steganography algorithm presented in [29]. The proposed steganography-based hiding approach consists of two main processes: the ransomware embedding process (REP) and the ransomware extraction process (RExP), as shown in Fig. 1. The REP starts by selecting the proper media cover (HEVC video) and randomly extracting one of the video frames. The selection of cover video frames depends on the size of the ransomware sample and the capacity of the cover video. Therefore, prior to the embedding process, the proper video frame is selected based on its resolution in order to hide the ransomware sample without affecting the main features and quality of the cover frame. After that, the chosen frame is forwarded as an input to the embedding phase, as detailed in Algorithm (1).
- Divide both the secret ransomware data and the input cover HEVC frame into different blocks.
• Divide the K-pixel cover video frame into different blocks Q_j of pixel values, where LSB_j1, LSB_j2, ..., LSB_jk, LSB_j(k+1) refer to the LSBs of the pixel values within each block, and LSB_j(k+1) is the marked bit of each block Q_j.
• Divide the secret ransomware data into blocks S_j (j = 1, 2, ..., k), where S_j1, S_j2, ..., S_jk refer to the binary bits of each block within the secret data.
for all divided blocks, do
- Initialize the LSB-based steganography cost function.
- Calculate the Hamming distance (HD) using Eqs. (1) and (2) between the LSB_j1, LSB_j2, ..., LSB_jk, LSB_j(k+1) of Q_j and their corresponding S_j1, S_j2, ..., S_jk bits of S_j. So, HD_j represents the total number of differences between the LSBs of Q_j and the related bits of S_j.
- Calculate the value of the stego pixel P(Q_ji, S_ji) using Eq. (4).
- Gather the stego blocks Q_j whose pixel values include the secret ransomware data.
else if HD_j ≥ k/2, then
- Set the marked bit LSB_j(k+1) to 1.
- Gather the stego blocks Q_j whose pixel values include the secret ransomware data.
end if
end for
output: Stego HEVC frame.
The output of this algorithm is the stego frame, that is, the frame injected with the ransomware application (apk). The quality of the stego frame is then assessed in depth using 16 different metrics. If the stego frame passes the quality check, it is combined with the rest of the frames to restore the complete video, which now carries the ransomware.
In contrast, the RExP starts by taking the stego HEVC video as an input to extract the stego frame and forward it to the ransomware extraction phase, as detailed in Algorithm (2). This algorithm's outputs are the ransomware application itself and the original frame. The original frame is then combined with the rest of the frames to restore the original clean HEVC video.

Algorithm (2): Steps of the extraction phase
input: Stego HEVC frame.

The main idea of the utilized steganography approach is to divide the secret ransomware and the cover HEVC frame into different blocks. The cover HEVC frame used to embed the secret ransomware data is selected randomly. After that, the LSB-based Hamming distance calculation is performed between the divided blocks of the secret data (ransomware) and those of the cover frames. Finally, the secret data bits are hidden in the marked bits of the cover HEVC frame blocks based on the estimated Hamming distance value. The major improvement of the introduced HEVC steganography approach is the use of an adaptive steganography cost function, which reduces the embedding impact of ransomware hiding on the stego HEVC frames by conserving the histogram features of the cover frames while providing the desired imperceptibility. Furthermore, this approach accomplishes high capacity and superior hiding efficacy by ensuring the undetectability of the hidden ransomware data within the video frames. Several quality evaluation metrics are examined to assess the performance of the REP and RExP processes. If the assessment metrics of the stego frame do not achieve the desired values, the REP process is repeated in order to select a cover video frame whose resolution better suits the size of the secret ransomware data, achieving higher perceptual quality and adequate capacity.

Model Evaluation and Result Discussions
This section presents the features of the ransomware samples and the standard cover HEVC streams used in the experiments of this research. It also lists the subjective and objective evaluation metrics that were applied to examine the performance of the proposed hiding approach, as shown in Fig. 2. The objective evaluation included 16 metrics to assess the quality of the resulting stego frame. Moreover, 65 antivirus engines were used to scan the stego frame and the stego video to check whether these engines can detect the presence of the ransomware. Subjective tests were also considered in this study by comparing (a) the original frame and the stego frame and (b) the original video and the stego video. Finally, the results of all metrics are presented and analysed.

Ransomware Samples and Standard HEVC Streams
To demonstrate the efficiency of the proposed ransomware hiding approach, many experiments were conducted. In these experiments, we utilized different ransomware samples as secret messages and different HEVC streams with various resolutions as cover media. The purpose was to check the capability of the proposed approach to hide ransomware of different sizes within cover video frames of different resolutions without affecting the main features and quality of the cover frames. Additionally, the proposed approach aims to achieve high secrecy of the ransomware by making it undetectable even by specialized antivirus engines. Tab. 2 presents the sizes of the tested ransomware samples, while Tab. 3 introduces the resolutions of the tested HEVC streams.

Quality Assessment
As part of the objective-based evaluation, 16 metrics are mathematically presented in this section. Throughout the following equations, x(m, n) signifies the original cover video frame, x̂(m, n) represents the resulting stego video frame, and M and N are the numbers of pixels in rows and columns, respectively.

• Mean Square Error (MSE)
MSE [30] is one of the quality assessment metrics used in image and video quality evaluation applications. It is used to estimate the error between the cover and stego video frames. A lower value of the MSE metric means that the video frame has a good quality and that there is a higher similarity between the cover and stego video frames. This metric is mathematically represented in Eq. (6).

• Peak Signal to Noise Ratio (PSNR)
The PSNR metric [30] is a function of the MSE metric. So, it is preferable to get a large PSNR value to obtain a good quality for the resulting stego video frame. The PSNR is measured in decibels and it is represented in Eq. (7).

• Signal-to-Noise Ratio (SNR)
SNR [31] is described as the ratio between the average powers of the signal and the noise. It is measured in decibels and it is presented in Eq. (8). It is preferable to achieve a large SNR value to obtain a good quality for the resulting stego video frame.
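
The relationship among the MSE, PSNR, and SNR metrics can be illustrated with a minimal sketch. It assumes the commonly used definitions (mean of squared pixel differences, 10·log10 of the squared peak value over the MSE, and 10·log10 of signal power over noise power) and an 8-bit peak value of 255; the function names and the tiny 2×2 frames are hypothetical and are not taken from the experiments of this paper.

```python
import math

def mse(cover, stego):
    """Mean square error between two equally sized frames (lists of pixel rows)."""
    rows, cols = len(cover), len(cover[0])
    total = sum((cover[m][n] - stego[m][n]) ** 2
                for m in range(rows) for n in range(cols))
    return total / (rows * cols)

def psnr(cover, stego, peak=255):
    """PSNR in dB; peak=255 is an assumption for 8-bit frames."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / err)

def snr(cover, stego):
    """SNR in dB: cover signal power over the power of the embedding noise."""
    signal = sum(p ** 2 for row in cover for p in row)
    noise = sum((c - s) ** 2
                for row_c, row_s in zip(cover, stego)
                for c, s in zip(row_c, row_s))
    if noise == 0:
        return math.inf
    return 10 * math.log10(signal / noise)

# Hypothetical frames: two pixels differ by one grey level.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]
print(mse(cover, stego), psnr(cover, stego), snr(cover, stego))
```

Because the PSNR depends on the MSE only through the logarithm, halving the MSE raises the PSNR by about 3 dB, which is why the two metrics always move in opposite directions.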

• Weighted Signal-to-Noise Ratio (WSNR)
The WSNR metric is a weighted form of the SNR metric, developed by Varkur and Mitsa [32] utilizing the contrast sensitivity function. It is the ratio between the average weighted powers of the signal and the noise. It is measured in decibels, and it is better to obtain a large WSNR value to attain a good quality for the resulting stego video frame.

• Noise Quality Measure (NQM)
NQM [33] is used to determine the distortion caused to a video frame due to both frequency shift and noise effect. Also, the NQM metric can be employed to estimate the effects of local luminance, contrast perception, texture masking, and contrast masking. Thus, it can be considered a weighted version of the SNR metric between the cover and stego video frames. Consequently, it is desirable to get a large value of NQM to obtain a good quality for the resulting stego video frame. The NQM is measured in decibels and it is represented in Eq. (9).

• Structural Content (SC)
The SC metric [34] is the ratio of the power of the original signal (cover video frame) to the power of the processed signal (stego video frame). So, it is preferable to obtain a small SC value to get good quality for the resulting stego video frame. It is defined in Eq. (10).

• Maximum Difference (MD)
The MD metric [35] determines the maximum amount of error in the processed signal compared to the original signal. It estimates the difference between the reference cover video frame and the processed stego video frame. Thus, it is preferable to obtain a small MD value to get a good quality of the resulting stego video frame. It is defined in Eq. (11).

• Normalized Absolute Error (NAE)
The NAE metric [30] is the ratio between the absolute difference of the cover and stego video frames and the absolute value of the reference cover video frame. To achieve a good quality of the stego video frame, a low value of the NAE metric is recommended. It is represented in Eq. (12).

• Laplacian Mean Square Error (LMSE)
The LMSE evaluation metric [36] is based on measuring the edges of the video frame. It is better to achieve a small LMSE value to obtain a good quality for the resulting stego video frame. The LMSE metric is mathematically represented in Eq. (13):

$$\mathrm{LMSE} = \frac{\sum_{m}\sum_{n}\left[L(x(m,n)) - L(\hat{x}(m,n))\right]^{2}}{\sum_{m}\sum_{n}\left[L(x(m,n))\right]^{2}} \tag{13}$$

where the Laplacian operator is symbolized by L(x(m, n)) for the signal x(m, n), x̂(m, n) is the stego video frame, and the operator is given as:

$$L(x(m,n)) = x(m+1,n) + x(m-1,n) + x(m,n+1) + x(m,n-1) - 4\,x(m,n) \tag{14}$$

• Structural Similarity Index (SSIM)
The SSIM metric [37] is utilized to estimate the visual effect of the luminance shift, contrast changes, and structural alterations of a video frame. So, it is used to extract the structural information of the objects inside the input video frame. Thus, the estimated degree of structural similarity is a clear indication of the perceived video quality. The SSIM metric between the cover and stego frames, x and x̂, is described in Eq. (15), where s(x, x̂), c(x, x̂), and l(x, x̂) refer to the structural, contrast, and luminance components, respectively. In their definitions, c_1, c_2, and c_3 are small positive constants, μ_x and μ_x̂ signify the means of the cover and stego video frames, σ_x and σ_x̂ signify the standard deviations of the cover and stego video frames, and σ_xx̂ is the covariance between the cover and stego video frames. For an 8-bit grayscale video frame with L = 2^8 gray levels, c_1 = (k_1 L)^2, c_2 = (k_2 L)^2, and c_3 = c_2/2, where k_1 = 0.01 and k_2 = 0.03. It is noticed that in the case of c_1 = c_2 = 0, the SSIM metric reduces to the Universal Quality Index (UQI) metric. The range of the SSIM metric is −1 to 1. Therefore, obtaining a high value of SSIM indicates high similarity between the cover and stego video frames.
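
How the luminance, contrast, and structural components combine can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the widely published product form l·c·s with the constants given above, and it computes the statistics once over the whole frame, whereas practical SSIM implementations usually average the index over local windows. The function name `ssim_global` and the sample frames are hypothetical.

```python
import math

def ssim_global(cover, stego, levels=256, k1=0.01, k2=0.03):
    """Single-window SSIM sketch computed from whole-frame statistics."""
    x = [p for row in cover for p in row]
    y = [p for row in stego for p in row]
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    # Constants as described in the text: c1 = (k1 L)^2, c2 = (k2 L)^2, c3 = c2 / 2.
    c1, c2 = (k1 * levels) ** 2, (k2 * levels) ** 2
    c3 = c2 / 2
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure

# Hypothetical frames: two pixels differ by one grey level.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]
print(ssim_global(cover, stego))
```

Calling the same function with `k1=0` and `k2=0` removes the stabilizing constants, which corresponds to the reduction to the UQI mentioned above; in that case the function is undefined for completely flat frames, since the denominators become zero.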

• Multi-Scale SSIM Index (MS-SSIM)
The MS-SSIM metric is considered as an improved version of the SSIM metric. It is devised to determine the visual quality of a video frame based on multiple scales [38]. So, it has different forms of scales. The lowest scale is utilized to measure the luminance component, whilst the structural and contrast components are determined based on the j scale, and it has also the highest scale represented as M. The range of the MS-SSIM metric is −1 to 1. Therefore, obtaining a high value of MS-SSIM indicates high similarity between the cover and stego video frames.

• Feature Similarity Index (FSIM)
The FSIM metric [39] is utilized to extract the low-level features within a video frame, such as gradient magnitude and phase congruency. The gradient magnitude carries the contrast information, while the phase congruency contains most of the information about the primary features. The range of the FSIM metric is −1 to 1. So, achieving a high value of FSIM means a high similarity between the cover and stego video frames. The FSIM metric is described in Eq. (19), where the gradient magnitude information can be estimated using the Sobel operator S_L(x), the spatial domain of the video frames is denoted by Ω, and the projected phase congruency information is given by PC_m(x).

• Universal Quality Index (UQI)
UQI [40] is a universal metric rather than one that is local or specifically tailored to the video frames being examined or to particular observers. Therefore, it is recommended for quality assessment in image and video applications, where it combines the correlation, luminance, and contrast components, as determined in Eq. (20). The UQI metric is then defined in Eq. (21). The range of the UQI metric is −1 to 1, so a higher UQI value indicates a higher similarity between the cover and stego video frames.

• Normalized Cross Correlation (NK)
The NK metric [41] is used to compare the processed stego video frame and the reference cover video frame. It is expressed in Eq. (22). For the steganography process to succeed, it is preferable to reach the highest value of 1 between the cover and stego frames, which indicates higher performance efficiency.
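
The pixel-wise metrics described so far (SC, MD, NAE, and NK) are simple enough to state directly in code. The sketch below assumes the definitions commonly used in the image-quality literature, which may differ in detail from Eqs. (10), (11), (12), and (22); the function names and the sample frames are hypothetical.

```python
def _flat(frame):
    """Flatten a frame given as a list of pixel rows."""
    return [p for row in frame for p in row]

def structural_content(cover, stego):
    """SC: power of the cover frame over power of the stego frame."""
    x, y = _flat(cover), _flat(stego)
    return sum(a * a for a in x) / sum(b * b for b in y)

def maximum_difference(cover, stego):
    """MD: largest absolute pixel error."""
    return max(abs(a - b) for a, b in zip(_flat(cover), _flat(stego)))

def normalized_absolute_error(cover, stego):
    """NAE: summed absolute error over summed absolute cover values."""
    x, y = _flat(cover), _flat(stego)
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(abs(a) for a in x)

def normalized_cross_correlation(cover, stego):
    """NK: cross-correlation of cover and stego, normalized by cover power."""
    x, y = _flat(cover), _flat(stego)
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical frames: two pixels differ by one grey level.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]
print(structural_content(cover, stego), maximum_difference(cover, stego),
      normalized_absolute_error(cover, stego),
      normalized_cross_correlation(cover, stego))
```

For an LSB-style embedding that changes each pixel by at most one grey level, MD is at most 1 and both SC and NK stay very close to 1, as the sample frames show.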

• Average Difference (AD)
The AD metric [42] determines the average variation between the reference cover video frame and the processed stego video frame. So, it is desirable to get a smaller value of AD to obtain a good quality for the resulting stego video frame. It is calculated in Eq. (23).

• Pixel-Based Visual Information Fidelity (VIFP)
The VIFP metric is an improved version of the Visual Information Fidelity (VIF) metric with a low computational cost. It is used to extract and compare the pixel-level information within the cover and stego video frames [43]. It is preferable to get the highest value of 1 between the cover and stego frames to accomplish the high performance of the employed steganography process.

• Entropy (E)
The entropy metric is utilized to estimate the amount of information in the cover and stego frames. It is preferable to get identical entropy values for the stego and cover frames. It is calculated in Eq. (24), where the j-th grey level value of the frame is denoted by m_j and the probability of m_j in a video frame is given by P(m_j).
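
A minimal sketch of the entropy computation follows. It assumes the Shannon form, the negative sum of P(m_j)·log2 P(m_j) over the grey levels that occur in the frame, with P(m_j) estimated from pixel counts; the function name and the sample frames are hypothetical.

```python
import math
from collections import Counter

def entropy(frame):
    """Shannon entropy (bits per pixel) of a frame given as a list of pixel rows."""
    pixels = [p for row in frame for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    # P(m_j) is estimated as the relative frequency of grey level m_j.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical frames: two pixels differ by one grey level.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]
print(entropy(cover), entropy(stego))
```

A flat frame has zero entropy, which is why difference frames that are almost entirely black yield entropy values close to zero.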

Results Discussion
This section presents and discusses the results of all examined evaluation metrics considered in this study.

Video Quality Assessment
To evaluate the employed HEVC steganography approach, we performed various experiments using the different HEVC streams and ransomware samples that were presented in Section 4.1.
Tab. 4 presents the subjective findings of the tested HEVC frames with distinct resolutions in case of hiding five ransomware samples with different sizes, while Tab. 5 introduces the histogram findings.

Race (frame 25) in case of using ransomware1
It is observed from the introduced results in Tab. 4 that the suggested steganography approach achieves high imperceptibility results, where the stego frames are visually similar to the cover frames with a minor difference in their entropy values. This can also be observed by the obtained difference frames between the cover and stego frames, where their entropies (the amount of information) have very low values near to zero. This is clearly shown by the completely black pixels in the resulted difference frames.
Furthermore, the histogram results in Tab. 5 confirm the imperceptibility of the embedding: the cover and stego frames show approximately the same pixel intensity distributions with similar histograms. Moreover, the histograms of the difference frames show no pixel distribution except a low distribution around the zero pixel value. Tab. 6 provides the objective quality assessment results of the tested video streams after embedding the ransomware samples. The table shows the results of the 16 different evaluation metrics that assess the quality of the stego frames. The targeted optimal values to be achieved by each of these metrics are also listed in Tab. 6. The obtained results clearly indicate that the employed steganography approach achieves significant performance. This is revealed by attaining low values of the MSE, SC, MD, LMSE, NAE, and AD metrics and high values of the PSNR, SSIM, UQI, FSIM, NQM, NK, SNR, VIFP, WSNR, and MS-SSIM metrics in all tested video streams.
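
The histogram comparison discussed above can be reproduced for any pair of frames with a short sketch. The helper names, the normalized L1 histogram distance, and the sample frames are assumptions made for illustration; they are not the procedure used to produce Tab. 5.

```python
from collections import Counter

def histogram(frame, levels=256):
    """Pixel intensity histogram of a frame given as a list of pixel rows."""
    counts = Counter(p for row in frame for p in row)
    return [counts.get(v, 0) for v in range(levels)]

def difference_frame(cover, stego):
    """Absolute pixel-wise difference between cover and stego frames."""
    return [[abs(c - s) for c, s in zip(row_c, row_s)]
            for row_c, row_s in zip(cover, stego)]

def histogram_distance(h1, h2):
    """Normalized L1 distance in [0, 1] between two histograms of equal mass."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * sum(h1))

# Hypothetical frames: two pixels differ by one grey level.
cover = [[100, 101], [102, 103]]
stego = [[100, 100], [103, 103]]
diff_hist = histogram(difference_frame(cover, stego))
print(diff_hist[:3], histogram_distance(histogram(cover), histogram(stego)))
```

When the embedding changes pixels by at most one grey level, the histogram of the difference frame is concentrated in bins 0 and 1, which matches the observation that the difference frames show a distribution only around the zero pixel value.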

Antivirus Scan
The antivirus scanning was performed using the VirusTotal platform as part of the security test. VirusTotal conducts malware detection scanning utilizing over 65 antivirus scanning vendors such as Kaspersky, McAfee, Avast, Symantec, and many others.
In this experiment, the scanning was carried out in three stages. Initially, the original ransomware was scanned before hiding it inside the cover video frame. Following that, both the video frame with the embedded ransomware file (stego frame) and the combined video (stego video) were scanned to investigate the effectiveness of the applied steganography algorithm.
The results of VirusTotal scanning are shown in Fig. 3. As can be seen, a total of 35 out of 65 engines detected the ransomware file before applying the steganography algorithm (Fig. 3a). However, the ransomware was not detected by any engine after embedding it within the video frame (Fig. 3b). Furthermore, the antivirus scan of the combined HEVC stream, where the ransomware is concealed inside the video frame, shows that none of the VirusTotal engines was able to detect it (Fig. 3c). In addition to the antivirus scan, we managed to upload and stream the stego videos through the research lab YouTube channel, which is further evidence of bypassing the existing security checks. This underlines the high efficiency of the proposed hiding approach.

Conclusion and Future Works
This paper has proposed an efficient, novel ransomware hiding approach using HEVC steganography. This work highlighted a shortcoming of the existing ransomware detection systems: they did not investigate the possibility that the ransomware itself is hidden, nor ways to detect it, extract it, and then analyze it. Therefore, this work has utilized steganography, specifically video steganography, to hide ransomware with high efficiency in terms of (a) preserving the quality of the video and its characteristics after embedding the ransomware and (b) concealing the ransomware itself by making it difficult to detect even by well-known antivirus engines. The proposed hiding approach was extensively examined using different subjective and objective metrics and by embedding different ransomware samples into video covers with various resolutions. The results revealed that the proposed approach succeeded in hiding ransomware and passing all quality and security tests. As future work, different steganography approaches can be explored to hide new ransomware families or different malware apps in general. Also, an encryption stage can be added to encrypt the ransomware samples before embedding them within the cover video frames. Furthermore, different formats of multimedia files (e.g., image and audio) may be utilized as cover media. Moreover, advanced artificial intelligence tools and well-trained deep learning models can be utilized to test the possibility of detecting the hidden ransomware apps within video frames.