Deep Learning Based Stacked Sparse Autoencoder for PAPR Reduction in OFDM Systems

Orthogonal frequency division multiplexing (OFDM) is an efficient and flexible modulation technique and a central part of many wired and wireless standards. Combined with multiple-input multiple-output (MIMO) transmission, OFDM achieves high spectral efficiency and data rates for wireless mobile communication systems. Although it offers better quality of service, its high peak-to-average power ratio (PAPR) is the major issue that needs to be resolved in MIMO-OFDM systems. Earlier studies have addressed the high PAPR of OFDM systems using clipping, coding, selected mapping, tone injection, peak windowing, etc. Recently, deep learning (DL) models have exhibited improved performance on channel estimation, signal recognition, channel decoding, modulation identification, and end-to-end wireless systems. In this view, this paper presents a new Hyperparameter Tuned Deep Learning based Stacked Sparse Autoencoder (HPT-SSAE) technique for PAPR reduction in OFDM systems. The proposed model aims to substantially reduce the peaks in the OFDM signal. It adaptively creates a peak-canceling signal based on the features of the input signal. In the HPT-SSAE model, the constellation mapping and demapping of symbols on every individual subcarrier are performed adaptively by the SSAE so that the bit error rate (BER) and the PAPR of the OFDM system are jointly reduced. In addition, to enhance the performance of the SSAE model, hyperparameter tuning is performed using the monarch butterfly optimization (MBO) algorithm. A comprehensive set of simulations was performed to assess the HPT-SSAE model. The experimental values showed that the proposed model improves on the compared methods in terms of BER, complementary cumulative distribution function (CCDF), and execution time.


Introduction
The integration of OFDM with multiple-input multiple-output (MIMO) wireless communication leads to the MIMO-OFDM system [4], a significant technique for wireless communication systems owing to its high data rate. Besides, OFDM reduces the computational complexity of the MIMO transceiver by transforming a frequency-selective MIMO channel into a collection of parallel frequency-flat MIMO channels [5]. However, the transmitted OFDM signal is the superposition of many subcarriers produced by an Inverse Fast Fourier Transform (IFFT), and it therefore exhibits a high peak-to-average power ratio (PAPR). This is a major limitation in the design of OFDM systems. When a transmitter operates with a high PAPR, the average output power has to be reduced considerably relative to the saturation power of the amplifier.
With currently available commercial wireless systems, the PAPR problem is most important in the uplink [6], because the uplink is the limiting link with respect to range and coverage. Since mobile terminals rely on built-in batteries, the efficiency of the power amplifier is essential. Fig. 1 shows the structure of the OFDM system [7].
A trend in 5G is the use of high frequency bands to gain access to additional unused spectrum, and several studies have been carried out to accomplish it [8]. In forthcoming 5G smartphones that use beamforming, PAPR reduction is essential given the low power efficiency of mmWave power amplifiers (PAs) and the limited battery efficiency, as studied in Huo et al. [9]. Furthermore, in tactical communication, coverage is a serious concern, and vehicle-to-vehicle broadband data transmission necessitates a robust output power. The difficulty is that a PA with a large power range is highly expensive. Consequently, a practical OFDM design needs to consider every means of reducing the high PAPR. Several works have addressed the issue using distinct mechanisms such as clipping and filtering, selected mapping, coding, peak windowing, tone injection, and Partial Transmit Sequence (PTS). Each of these techniques has its own merits and demerits. However, the PTS model is considered an effective method owing to its improved performance in PAPR reduction. Recently, Machine Learning (ML) models have emerged in the design of OFDM systems to accomplish superior PAPR reduction. This paper develops a novel Hyperparameter Tuned Deep Learning based Stacked Sparse Autoencoder (HPT-SSAE) technique for PAPR reduction in OFDM systems. The goal of the proposed HPT-SSAE model is to train the network so as to minimize the PAPR with no degradation of the BER. The HPT-SSAE model dynamically generates a peak-canceling signal depending on the characteristics of the input signal. To further improve the efficiency of the SSAE model, its hyperparameters are tuned using the monarch butterfly optimization (MBO) algorithm. An extensive experimental analysis is carried out to assess the performance of the HPT-SSAE model.
In short, the contributions of the paper are as follows.
Propose a new HPT-SSAE based PAPR reduction technique for OFDM systems.
Aim to substantially reduce the peaks in the OFDM signal.
Employ the SSAE model for peak-canceling signal generation.
Perform hyperparameter tuning of the SSAE model using the MBO algorithm.
Validate the results of the HPT-SSAE model in terms of different measures.

Literature Review
Al-Jawhar et al. [10] presented a novel PTS model for decreasing the high PAPR of filtered OFDM systems. They analyzed different parameters, namely frequency localization, bit error rate (BER), and computational complexity, with and without PTS. In Amhaimar et al. [11], a low-complexity PTS model is presented based on the Fireworks Algorithm (FWA), a Swarm Intelligence (SI) technique. The FWA is an iterative technique that runs until the stopping condition is satisfied, and it involves four major components: the mutation operation, the mapping rule, the explosion operator, and the selection approach. The experimental values indicated that the presented model achieved better PAPR performance and convergence rate than the traditional methods.
In Wang et al. [12], an effective DL based tone reservation network (TRNet) is presented for OFDM systems to enhance the outcome of the tone reservation (TR) scheme. In particular, TRNet reserves a portion of the tones for the generation of a peak-canceling signal. A feed forward neural network (FFNN) is applied to generate the peak-canceling signal adaptively based on the nature of the input signal. In Kim et al. [13], a new PAPR reducing network (PRNet) is presented based on the autoencoder (AE) model of DL. Here, the constellation mapping and demapping of the symbols on every individual subcarrier are determined adaptively by the DL model so as to minimize the BER and PAPR of the OFDM system.
Sohn et al. [14] designed a novel PAPR reduction technique that uses a time-domain kernel matrix to generate the PAPR-reduction signals. In addition, the generation of the clipping noise is simplified by treating the clipping noise as a set of uncorrelated parabolic pulses. Depending on the instantaneous observation of the clipping noise, the presented model designs a simpler time-domain kernel matrix and applies a curve fitting technique to optimize the respective scaling factors. In Singh et al. [15], a novel Firefly (FF) technique is employed to search for the optimum combination of the phase vectors. The presented model offers an improved tradeoff between PAPR reduction and computational complexity over the PTS model for several sub-blocks.
To reduce the high PAPR, a PTS model based on the adaptive particle swarm optimization (PSO) algorithm is presented in [16]. It searches effectively for the optimum combination of the phase rotation factors in order to reduce the computational complexity. In Xiao et al. [17], a new SFLAHC-PTS model is presented for minimizing the PAPR of the signal. The presented model is an enhanced version of the PTS model that combines the merits of the shuffled frog leaping optimization and hill-climbing algorithms to tune the classical PTS model and reduce the computational complexity.
In Wang et al. [18], an effective scaling SCR (S-SCR) method is presented in which the scaling factor is an optimized vector with a peak regeneration constraint. To further improve the convergence rate and eliminate many peaks in every round, a multiple scaling SCR (MS-SCR) method is also presented. In Kumar et al. [19], a PTS scheme with a hybrid optimization technique called PS-GW is defined to achieve low PAPR and low computational complexity. The PS-GW algorithm integrates the PSO and Gray Wolf Optimization (GWO) algorithms to identify the optimum combination of phase rotation factors efficiently. The method is based on the idea that the exploitation ability of PSO is complemented by the exploration ability of GWO, so that the strengths of both variants are combined.

The Proposed HPT-SSAE Model
Signal transmission with OFDM-based transceivers is a commonly employed technique. OFDM partitions the available spectrum into a set of orthogonal subchannels of equal bandwidth, and every subchannel independently carries its own data on a separate subcarrier. The OFDM signal is the sum of all the independent subcarriers. In a multi-carrier transmission system, the input binary sequence is mapped to a set of symbols by a modulation technique. Next, the N symbols $X = [X_0, X_1, \ldots, X_{N-1}]^T$ are fed to the IFFT block, which modulates every subcarrier independently and yields the OFDM signal in the time domain. The complex envelope of the OFDM signal in the discrete time domain with oversampling factor L is defined by

$$x[n] = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k \, e^{j 2 \pi k n / (LN)}, \quad 0 \le n \le LN - 1 \qquad (1)$$

where N indicates the subcarrier count and $X_k$ is the kth complex symbol, sent through the kth subcarrier.
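As a numerical illustration of the oversampled OFDM signal in Eq. (1) and of the peak-to-average power ratio that the following paragraphs define, the sketch below builds one OFDM symbol and measures its PAPR. It is a minimal sketch, not the authors' simulation code: the function names `ofdm_time_signal` and `papr_db`, the QPSK mapping, N = 64, L = 4, and the fixed seed are all assumptions chosen for illustration.

```python
import numpy as np

def ofdm_time_signal(X, L=4):
    """Oversampled OFDM envelope: x[n] = (1/sqrt(N)) * sum_k X[k] * exp(j*2*pi*k*n/(L*N))."""
    X = np.asarray(X, dtype=complex)
    N = X.size
    n = np.arange(L * N)[:, None]   # time index 0 .. LN-1 (column)
    k = np.arange(N)[None, :]       # subcarrier index 0 .. N-1 (row)
    return np.exp(2j * np.pi * k * n / (L * N)) @ X / np.sqrt(N)

def papr_db(x):
    """Peak power over mean power of a complex signal, in dB."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

rng = np.random.default_rng(0)      # fixed seed, illustrative only
N = 64                              # assumed subcarrier count
bits = rng.integers(0, 2, size=(N, 2))
# Assumed QPSK mapping with unit average symbol power
X = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
x = ofdm_time_signal(X, L=4)
print(f"PAPR of one random QPSK OFDM symbol: {papr_db(x):.2f} dB")
```

With L = 1 the function reduces to a scaled `np.fft.ifft`; the worst case occurs when all N symbols are identical, where the peak power is N times the mean power.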
In Eq. (1), the time-domain signal produced by the IFFT comprises N independently modulated, orthogonal subcarriers, which can yield a high PAPR when they add up at the output of the IDFT block [20]. The PAPR of the discrete-time OFDM signal $x[n]$ is the ratio of the maximum power to the average power (the expected power $E\{\cdot\}$) of the complex OFDM signal, as represented in Eq. (2):

$$\mathrm{PAPR}\{x[n]\} = \frac{\max_{0 \le n \le LN-1} |x[n]|^2}{E\{|x[n]|^2\}} \qquad (2)$$

Architecture of SSAE Model
The SSAE is a neural network (NN) composed of multiple sparse autoencoders (SAEs) connected end to end. The output of the preceding sparse autoencoder layer is used as the input of the subsequent one, so that higher-level feature representations of the input data are obtained. A greedy layer-wise pretraining scheme is used to train the layers of the SSAE sequentially in order to obtain optimized connection weights and bias values for the whole SSAE network. Fig. 2 illustrates the representation of the AE.
Afterward, the error backpropagation (BP) technique is used to fine-tune the SSAE until the error function between the input and output data meets the desired requirement, which yields the final model parameters. With the error function denoted $J_{sparse}(W, b)$, the weights and biases are updated as

$$W \leftarrow W - \eta \frac{\partial J_{sparse}(W, b)}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial J_{sparse}(W, b)}{\partial b}$$

where $J_{sparse}(W, b)$ is computed from $X(n)$ and $Y(n)$, the nth input vector and its corresponding reconstruction vector, and $\eta$ is the learning rate.
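The fine-tuning step above can be illustrated with a single sparse autoencoder layer trained by plain gradient descent. The exact form of $J_{sparse}(W, b)$ is not recoverable from the text, so the sketch below assumes a common formulation: mean squared reconstruction error, an L2 weight-decay term, and a KL-divergence sparsity penalty, with sigmoid activations. The function name and the values of `rho`, `beta`, `lam`, and `eta` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_ae_loss_and_grads(params, X, rho=0.05, beta=3.0, lam=1e-4):
    """Assumed J_sparse = reconstruction error + weight decay + KL sparsity penalty."""
    W1, b1, W2, b2 = params
    m = X.shape[0]
    H = sigmoid(X @ W1 + b1)                 # hidden activations, shape (m, h)
    Y = sigmoid(H @ W2 + b2)                 # reconstruction Y(n) of X(n)
    rho_hat = H.mean(axis=0)                 # mean activation of each hidden unit
    recon = 0.5 * np.sum((Y - X) ** 2) / m
    decay = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    loss = recon + decay + beta * kl
    # Error backpropagation through the two layers
    dZ2 = (Y - X) * Y * (1 - Y) / m
    gW2 = H.T @ dZ2 + lam * W2
    gb2 = dZ2.sum(axis=0)
    sparsity = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / m
    dZ1 = (dZ2 @ W2.T + sparsity) * H * (1 - H)
    gW1 = X.T @ dZ1 + lam * W1
    gb1 = dZ1.sum(axis=0)
    return loss, (gW1, gb1, gW2, gb2)

rng = np.random.default_rng(0)               # fixed seed, illustrative only
X = rng.random((32, 8))                      # 32 toy input vectors of dimension 8
params = [rng.normal(0, 0.1, (8, 4)), np.zeros(4),
          rng.normal(0, 0.1, (4, 8)), np.zeros(8)]
eta = 0.2                                    # learning rate (hypothetical value)
losses = []
for _ in range(200):
    loss, grads = sparse_ae_loss_and_grads(params, X)
    losses.append(loss)
    # Gradient-descent update: parameter <- parameter - eta * gradient
    params = [p - eta * g for p, g in zip(params, grads)]
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In a stacked model, the hidden activations `H` of one trained layer would serve as the input of the next layer during greedy layer-wise pretraining, before the whole stack is fine-tuned in the same way.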
Because the SSAE model is subject to sparsity constraints, it calls for different learning rates for different parameters, such as reducing the update frequency for infrequent features. Fig. 3 demonstrates the architecture of the SSAE [21]. However, typical Gradient Descent (GD) techniques, including Stochastic Gradient Descent (SGD) and mini-batch GD, apply the same learning rate to every network parameter that needs to be updated, which makes it hard to select a proper learning rate and easy to get trapped in a local minimum.
In the SSAE, the input data are encoded onto the constellation plane by the encoder of the SSAE, which comprises $L_f = 5$ sub-blocks. The output of the encoding unit is determined by $W^f_{l_f}$ and $b^f_{l_f}$, the weights and biases of the $l_f$-th fully connected (FC) layer of the encoding unit. Next, the encoded symbols are passed through the IFFT operation, which produces the transmit signal. The signal is then sent over the wireless channel to the receiving end. Finally, the received signal is passed through an FFT operation and decoded by the SSAE decoder. The decoding unit is assumed to contain $L_g = 5$ sub-blocks, the same as the encoding unit. The output of the decoding unit is determined by its input $y$ and by $W^g_{l_g}$ and $b^g_{l_g}$, the weights and biases of the $l_g$-th FC layer of the decoding unit. The reconstructed symbol at the receiving end, $\hat{r}$, also depends on $H$, which signifies the effect of the wireless channel, such as multipath fading and thermal noise.
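The signal path described above (encoder, IFFT, channel, FFT, decoder) can be traced with a forward pass through untrained networks. This is only a structural sketch under several assumptions that are not in the paper: each sub-block is reduced to one FC layer, `tanh` is used between sub-blocks, complex symbols are represented by stacking real and imaginary parts, and the channel is a hypothetical two-tap circular multipath channel with additive Gaussian noise. All names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed, illustrative only
N = 16                           # number of subcarriers (hypothetical)
D = 2 * N                        # real and imaginary parts stacked

def make_unit(n_blocks, dim, rng):
    """Random, untrained FC sub-blocks; each is a (weights, biases) pair."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(dim), (dim, dim)), np.zeros(dim))
            for _ in range(n_blocks)]

def run_unit(blocks, v):
    """Apply the FC sub-blocks in order; tanh between blocks is an assumption."""
    for i, (W, b) in enumerate(blocks):
        v = v @ W + b
        if i < len(blocks) - 1:
            v = np.tanh(v)
    return v

def to_complex(v):
    return v[:N] + 1j * v[N:]

def to_real(c):
    return np.concatenate([c.real, c.imag])

def channel(x, h, noise_std, rng):
    """Circular multipath channel with taps h, plus complex Gaussian noise."""
    y = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, x.size))
    noise = noise_std * (rng.normal(size=x.size) + 1j * rng.normal(size=x.size)) / np.sqrt(2)
    return y + noise

encoder = make_unit(5, D, rng)   # L_f = 5 sub-blocks
decoder = make_unit(5, D, rng)   # L_g = 5 sub-blocks

r = rng.choice([-1.0, 1.0], size=D) / np.sqrt(2)        # QPSK symbols as a real vector
X = to_complex(run_unit(encoder, r))                    # encoded constellation points
x = np.fft.ifft(X) * np.sqrt(N)                         # transmit signal (time domain)
h = np.array([1.0, 0.3 + 0.2j])                         # hypothetical channel taps
y = np.fft.fft(channel(x, h, 0.05, rng)) / np.sqrt(N)   # received symbols after the FFT
r_hat = run_unit(decoder, to_real(y))                   # reconstructed symbols
print(r_hat.shape)
```

Because the weights are random, `r_hat` does not yet resemble `r`; training is what makes the encoder and decoder a matched pair.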

Hyperparameter Optimization of SSAE Model
The MBO technique is used to tune the hyperparameters of the SSAE model. MBO is a population-based technique that belongs to the class of SI techniques, which are inspired by the behavior of species with swarming tendencies such as bees and butterflies. MBO was proposed by Wang et al. [22] and is based on the behavior of the monarch butterfly, which is native to North America and is recognizable by its black and orange pattern. The migratory behavior of these butterflies is the inspiration for solving optimization problems. The following rules and basic ideas are followed to reach the optimum solution of the given problem:
1. Each butterfly in the population is located either in land1 (the home before migration) or in land2 (the home after migration).
2. Every child butterfly is generated by the migration operator, whether its parent is located in land1 or in land2.
3. The population size does not change and remains constant, so one of the two (the new child or its parent) is discarded according to the fitness function.
4. The butterflies selected by the fitness function move on to the next generation and are not altered by the migration operator.
The butterflies start migrating in April, when they leave land1 and head for land2, and the reverse migration starts in September. The monarch butterflies in the two lands together form the entire population, whose size is denoted NP.

Migration Operator
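The migration process of a butterfly is expressed as $x_{i,k}^{t+1} = x_{r_1,k}^{t}$, where $x_{i,k}^{t+1}$ denotes the kth element of $x_i$ at generation $t+1$, which represents the position of butterfly i, and $x_{r_1,k}^{t}$ denotes the kth element of the newly generated position of butterfly $r_1$, a butterfly selected at random from land1. This update is applied when $r \le p$, where r is a random number computed as $r = rand \times peri$, rand is a random number drawn from a uniform distribution, and peri denotes the migration period. Alternatively, when $r > p$, the kth element of the newly generated position is computed as [23] $x_{i,k}^{t+1} = x_{r_2,k}^{t}$, where $x_{r_2,k}^{t}$ denotes the kth element of $x_{r_2}$ at generation t, the position of butterfly $r_2$, selected at random from land2. Here p denotes the ratio of monarch butterflies in land1.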
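The migration operator described above can be sketched element by element as follows. This is a minimal illustration rather than the authors' implementation: the function name, the array layout (one butterfly per row), and the values `peri = 1.2` and `p = 5/12` are assumptions, the latter two being settings commonly reported for MBO rather than values stated in this paper.

```python
import numpy as np

def migration_operator(land1, land2, p, peri=1.2, rng=None):
    """Return new positions for the butterflies in land1 (one butterfly per row)."""
    rng = np.random.default_rng() if rng is None else rng
    n1, dim = land1.shape
    new_land1 = np.empty_like(land1)
    for i in range(n1):
        for k in range(dim):
            r = rng.random() * peri                      # r = rand * peri
            if r <= p:
                r1 = rng.integers(land1.shape[0])        # random butterfly in land1
                new_land1[i, k] = land1[r1, k]
            else:
                r2 = rng.integers(land2.shape[0])        # random butterfly in land2
                new_land1[i, k] = land2[r2, k]
    return new_land1

rng = np.random.default_rng(0)                           # fixed seed, illustrative only
land1 = rng.uniform(-1.0, 1.0, size=(5, 3))              # 5 butterflies, 3 hyperparameters
land2 = rng.uniform(-1.0, 1.0, size=(7, 3))
children = migration_operator(land1, land2, p=5 / 12, rng=rng)
print(children.shape)
```

Each element of a child is copied from some existing butterfly, so the operator recombines positions without leaving the range already spanned by the population.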

Butterfly Adjustment Operator
The balance between the two directions of migration between land1 and land2 is controlled by the ratio p. When p is larger, more elements are taken from butterflies in land1 than from land2, and vice versa. In the butterfly adjusting operator, the position of butterfly j is updated, when the generated random number rand is less than or equal to p, as $x_{j,k}^{t+1} = x_{best,k}^{t}$, where $x_{j,k}^{t+1}$ denotes the kth element of $x_j$ at generation $t+1$, which represents the position of butterfly j, and $x_{best,k}^{t}$ denotes the kth element of $x_{best}$, the best butterfly in land1 and land2 at the current generation t. When $rand > p$, the element is instead updated as $x_{j,k}^{t+1} = x_{r_3,k}^{t}$, where $x_{r_3,k}^{t}$ denotes the kth element of a butterfly $r_3$ selected at random from land2. In this case, when rand is also larger than BAR, the new position is further updated as $x_{j,k}^{t+1} = x_{j,k}^{t+1} + \alpha \, (dx_k - 0.5)$, where BAR denotes the butterfly adjusting rate and dx represents the walk step of butterfly j, which is computed by performing a Lévy flight as $dx = \mathrm{Levy}(x_j^{t})$. The weighting factor $\alpha$ in this update is computed as $\alpha = S_{max} / t^2$, where $S_{max}$ indicates the maximum walk step that a butterfly can move in one step and t denotes the current generation. Fig. 4 showcases the flowchart of the MBO technique [24].
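The butterfly adjusting operator can be sketched in the same style. The branching on p and BAR follows the description above; everything else is an assumption made for illustration: the function names, the default `s_max`, and in particular the Lévy step, which is drawn here with Mantegna's algorithm (stability index 1.5) as a stand-in, since the paper does not say how the Lévy flight is generated.

```python
import math
import numpy as np

def levy_step(dim, rng, beta=1.5):
    """Heavy-tailed step via Mantegna's algorithm (an assumed choice of Levy generator)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def adjusting_operator(land2, best, p, bar, t, s_max=1.0, rng=None):
    """Return new positions for the butterflies in land2 at generation t (t >= 1)."""
    rng = np.random.default_rng() if rng is None else rng
    n2, dim = land2.shape
    alpha = s_max / t ** 2                               # weighting factor
    new_land2 = np.empty_like(land2)
    for j in range(n2):
        dx = levy_step(dim, rng)                         # walk step of butterfly j
        for k in range(dim):
            if rng.random() <= p:
                new_land2[j, k] = best[k]                # copy from the best butterfly
            else:
                r3 = rng.integers(n2)                    # random butterfly in land2
                new_land2[j, k] = land2[r3, k]
                if rng.random() > bar:
                    new_land2[j, k] += alpha * (dx[k] - 0.5)
    return new_land2

rng = np.random.default_rng(0)                           # fixed seed, illustrative only
land2 = rng.uniform(-1.0, 1.0, size=(7, 3))
best = land2[0].copy()                                   # stand-in for the fittest butterfly
updated = adjusting_operator(land2, best, p=5 / 12, bar=5 / 12, t=1, rng=rng)
print(updated.shape)
```

Because alpha shrinks as 1 / t^2, the random perturbation dominates in early generations and fades later, which shifts the search from exploration toward exploitation.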

Network Training on PAPR Reduction
The HPT-SSAE model is trained to decrease the PAPR while avoiding degradation of the BER. First, the HPT-SSAE is required to reconstruct the transmitted signal from the received signal so that the BER remains the same. Second, the HPT-SSAE is required to produce a transmit signal that exhibits a minimum PAPR [25]. To attain the first goal, the encoding unit of the HPT-SSAE model is trained to determine the appropriate constellation mapping from the input data, $r_k$, to the output, $X_k$, and the decoding unit of the HPT-SSAE is trained to decode the received signal. The loss function $L_1$ suited to this aim is given in Eq. (16), where $f(\cdot; \theta_f)$ and $g(\cdot; \theta_g)$ are the parametric representations of the encoding and decoding units, respectively, and $e$ indicates the noise at the receiver. Here, the weight matrices, biases, and activation layers are summarized by the hidden-node parameters, i.e., $\theta = \{W, b\}$. Through training, $\theta_f$ and $\theta_g$, i.e., the weights and biases of the HPT-SSAE, are determined by minimizing the loss function, so that the encoding unit produces a constellation that is robust to a random channel $H$ and the decoding unit provides an effective way of decoding the constellation mapping. At the same time, the loss function $L_2(r)$ employed for achieving a minimum PAPR is given in Eq. (17). The training process takes place in two stages. In the first stage, the mapping appropriate for the given ratio of noise power to signal power is found using the loss function $L_1$.
In the second stage of training, the weights and biases of the HPT-SSAE ($\theta_f$ and $\theta_g$) are learned using the joint loss function $L(r, \hat{r})$, which combines $L_1$ and $L_2$ so that both the PAPR and the BER are minimized. In the definition of $L(r, \hat{r})$, a weight parameter determines which loss, $L_1$ or $L_2$, is dominant.
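One way to picture the two-stage objective is the sketch below, where the joint loss is a reconstruction term plus a weighted PAPR term. The exact expressions of the loss functions are not recoverable from the text, so the forms used here are assumptions: a mean squared error for the reconstruction loss, the linear-scale PAPR of the transmit signal for the PAPR loss, and a simple weighted sum for the joint loss. The function names and the parameter name `weight` are hypothetical.

```python
import numpy as np

def papr(x):
    """Linear-scale peak-to-average power ratio of the transmit signal x."""
    power = np.abs(x) ** 2
    return float(power.max() / power.mean())

def loss_reconstruction(r, r_hat):
    """Assumed form of the first loss: mean squared reconstruction error."""
    return float(np.mean(np.abs(np.asarray(r) - np.asarray(r_hat)) ** 2))

def joint_loss(r, r_hat, x, weight):
    """Assumed weighted sum; weight = 0 corresponds to the first training stage."""
    return loss_reconstruction(r, r_hat) + weight * papr(x)

r = np.array([1.0, -1.0, 1.0, 1.0])          # transmitted symbols (toy values)
r_hat = np.array([0.9, -1.1, 1.0, 0.8])      # reconstructed symbols (toy values)
x = np.array([1.0, 1.0j, -1.0, -1.0j])       # constant-modulus transmit signal
print(joint_loss(r, r_hat, x, weight=0.0))   # stage 1: reconstruction only
print(joint_loss(r, r_hat, x, weight=0.1))   # stage 2: reconstruction plus PAPR
```

A larger weight pushes the optimizer toward flatter transmit signals at the cost of reconstruction accuracy, which is the BER and PAPR tradeoff the weight parameter controls.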

Experimental Validation
In this section, a detailed set of simulations is performed to compare the performance of the HPT-SSAE model with existing methods, namely the original OFDM, with-GA, and with-FSO algorithms. The results are examined for 128, 256, and 512 subcarriers.
For 128 subcarriers, in terms of SER, the original OFDM technique performed worst, attaining the highest SER. The with-GA method achieved a lower SER than the OFDM approach, and the with-FSO model reduced the SER further. The proposed HPT-SSAE model achieved the minimum SER of $10^{-0.61}$, whereas the OFDM, with-GA, and with-FSO techniques obtained higher SERs of $10^{-0.11}$, $10^{-0.45}$, and $10^{-0.53}$, respectively. In terms of CCDF, the OFDM model again performed worst, with the highest CCDF, while the with-GA and with-FSO algorithms both reduced the CCDF relative to the OFDM model. The HPT-SSAE approach achieved the minimum CCDF of 5.8 dB, whereas the OFDM, with-GA, and with-FSO methods obtained higher CCDF values of 11 dB, 6.2 dB, and 7 dB, respectively.
Figure 5: Result analysis of 128 subcarriers (a) BER (b) SER (c) CCDF
Fig. 6 showcases the BER, SER, and CCDF analysis of the HPT-SSAE technique for 256 subcarriers. From the figure, it is clear that the HPT-SSAE model performs effectively under the different measures. In terms of BER, the OFDM model performed worst, attaining the highest BER. The with-GA methodology achieved a lower BER than the OFDM method, and the with-FSO algorithm reduced the BER further. The presented HPT-SSAE model reached the minimum BER of $10^{-1.56}$, whereas the OFDM, with-GA, and with-FSO techniques obtained higher BERs of $10^{-0.56}$, $10^{-0.98}$, and $10^{-1.35}$, respectively. In terms of SER, the OFDM technique again performed worst, the with-GA method achieved a lower SER, and the with-FSO algorithm reduced it further. The HPT-SSAE technique reached the least SER of $10^{-0.75}$, whereas the OFDM, with-GA, and with-FSO algorithms obtained higher SERs of $10^{-0.42}$, $10^{-0.57}$, and $10^{-0.62}$, respectively. In terms of CCDF, the OFDM model displayed the highest CCDF, the with-GA method achieved a lower CCDF, and the with-FSO model reduced it further. The HPT-SSAE model reached the least CCDF of 5.2 dB, whereas the OFDM, with-GA, and with-FSO techniques obtained higher CCDF values of 11 dB, 7.2 dB, and 6 dB, respectively.
For 512 subcarriers, in terms of BER, the OFDM model performed worst among the compared methods, attaining the highest BER. The with-GA algorithm achieved a lower BER than the OFDM model, and the with-FSO algorithm reduced the BER further. The presented HPT-SSAE model reached the minimum BER of $10^{-2.11}$, whereas the OFDM, with-GA, and with-FSO algorithms obtained higher BERs of $10^{-0.90}$, $10^{-1.15}$, and $10^{-1.35}$, respectively. In terms of SER, the OFDM model again attained the highest SER, the with-GA algorithm achieved a lower SER, and the with-FSO algorithm reduced it further. The HPT-SSAE algorithm reached the least SER of $10^{-0.8}$, whereas the OFDM, with-GA, and with-FSO algorithms obtained higher SERs of $10^{-0.42}$, $10^{-0.58}$, and $10^{-0.62}$, respectively. In terms of CCDF, the OFDM model attained the highest CCDF, while the with-GA and with-FSO algorithms both reduced the CCDF relative to the OFDM approach. The HPT-SSAE approach reached the least CCDF of 5.3 dB, whereas the OFDM, with-GA, and with-FSO techniques obtained higher CCDF values of 11 dB, 5.8 dB, and 7 dB, respectively.
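For readers who want to reproduce the kind of CCDF curve from which such dB values are read, the sketch below estimates P(PAPR > threshold) empirically for plain OFDM. It is not the authors' simulation setup: the QPSK mapping, 2000 random symbols, 128 subcarriers, oversampling factor 4, and the threshold grid are all assumptions, and no PAPR reduction method is applied.

```python
import numpy as np

def papr_db_per_symbol(X, L=4):
    """PAPR in dB of each row of X (frequency-domain symbols), oversampled by L."""
    x = np.fft.ifft(X, n=L * X.shape[1], axis=1)   # zero-padded IFFT
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max(axis=1) / power.mean(axis=1))

def ccdf(papr_db, thresholds_db):
    """Empirical probability that the PAPR exceeds each threshold."""
    papr_db = np.asarray(papr_db)
    return np.array([(papr_db > g).mean() for g in thresholds_db])

rng = np.random.default_rng(0)                     # fixed seed, illustrative only
n_symbols, N = 2000, 128                           # assumed simulation size
X = (rng.choice([-1.0, 1.0], (n_symbols, N))
     + 1j * rng.choice([-1.0, 1.0], (n_symbols, N))) / np.sqrt(2)   # QPSK
thresholds = np.arange(4.0, 13.0, 1.0)             # PAPR thresholds in dB
curve = ccdf(papr_db_per_symbol(X), thresholds)
for g, c in zip(thresholds, curve):
    print(f"P(PAPR > {g:4.1f} dB) = {c:.4f}")
```

A PAPR reduction method shifts this curve to the left, so a single dB figure for a method corresponds to the threshold at which its curve crosses a chosen probability level.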

Conclusion
To address the PAPR problem in OFDM systems, this paper has introduced a new HPT-SSAE model for PAPR reduction. The HPT-SSAE model is intended to substantially reduce the peaks in the OFDM signal with no degradation of the BER. It creates a peak-canceling signal dynamically depending on the features of the input signal. In the HPT-SSAE model, the constellation mapping and demapping of symbols on every individual subcarrier are performed dynamically by the SSAE model. To further enhance the efficiency of the SSAE model, its hyperparameters are tuned using the MBO algorithm. An extensive experimental analysis of the HPT-SSAE model was carried out, and the outcomes indicated that it outperforms the compared methods in terms of BER, SER, and CCDF.