Diffusion-Based Channel Gain Estimation in WSNs Using Fractional-Order Strategies

In this study, it is proposed that the diffusion least mean square (LMS) algorithm can be improved by applying fractional-order signal processing methodologies. The application of Caputo's fractional derivative is considered in the optimization of the cost function, and a fractional-order variant of the diffusion LMS algorithm is derived. Its applicability is tested for the estimation of channel parameters in a distributed environment consisting of randomly placed sensors communicating through a wireless medium. The topology of the network is selected such that only a small number of nodes are informed, and a random sleep strategy is followed in the network to conserve transmission power at the nodes. The proposed fractional-order modified diffusion LMS algorithms are applied in the two configurations of combine-then-adapt and adapt-then-combine. The average squared error performance of the proposed algorithms, along with that of their traditional counterparts, is evaluated for the estimation of Rayleigh channel parameters. A mathematical proof of convergence is provided, showing that the nonlinear term resulting from the fractional derivative adjusts the autocorrelation matrix in such a way that the spread of its eigenvalues decreases. This improves both the convergence and the steady-state response, even for larger step sizes. Experimental results are shown for different numbers of nodes and fractional orders. The simulation results establish that the accuracy of the proposed scheme is far better than that of its classical counterparts, and it therefore better solves the channel gain estimation problem in a distributed wireless environment. The algorithm has the potential to be applied in other applications related to learning and adaptation.

databases include RC circuits modelled with nonlinear fractional-order systems [47], the design of a multi-innovation fractional LMS algorithm [48], fractional evolutionary processing [49], power signal parameter estimation [50], and the design of a momentum fractional LMS for Hammerstein nonlinear system identification [51]. Evaluation through numerical experimentation is performed for different numbers of sleep cycles and for different fractional orders, and the results are compared with the conventional DLMS. The performance metric of average squared error is used for the assessment. The contributions of the study are:
• Development of the fractional-order diffusion least mean square (FO DLMS) algorithm for channel estimation in distributed wireless sensor networks.
• Development of an FO variant of the LMS algorithm with fast convergence and stable MSE performance for larger step sizes.
• Use of a random sleep strategy so that only a few nodes are alive/informed, conserving energy by reducing communication overhead.
• Modification of both the combine-then-adapt (CTA) and adapt-then-combine (ATC) configurations, with supporting experiments.
• Mathematical convergence analysis as a proof of concept, and simulation comparison of the proposed strategies with existing techniques.
• Conclusions and future enhancements and applications.
The paper is organized as follows. Sections 2 and 3 describe the channel estimation problem from a two-node communication perspective. Section 4 presents an introduction to fractional derivatives, especially for polynomial functions. Section 5 presents distributed wireless channel gain estimation using the DLMS algorithm, and Section 6 discusses channel gain estimation for the CTA and ATC mechanisms using the fractional derivative based DLMS. In Section 7, simulation results and computational performance are presented, and finally the main concluding remarks and future work are given.

System Model and MSE Based Adaptive Algorithms
Designing adaptive algorithms requires that the filter weights be progressively adjusted according to an optimization criterion, mostly based on the input environment and the resulting error. For reliable communication between nodes, the channel gains must be estimated properly. In a distributed environment with limited power at the nodes, early channel parameter estimation helps improve frame efficiency by keeping the number of training or pilot symbols small, and conserves power by avoiding re-transmissions. This section is divided into two subsections: (a) two-node channel estimation, as shown in Fig. 1, and (b) channel gain estimation in a distributed wireless sensor network in which the nodes are positioned randomly. The basic principle is to send training or pilot symbols to estimate the channel, then send a block of useful data and use the already obtained channel estimates to compensate for the channel effects. We consider the transmission of Binary Phase Shift Keying (BPSK) symbols, drawn from a finite alphabet set x, through a channel with memory, modelled as an Mth-order Finite Impulse Response (FIR) filter. This is represented by the vector h_lk(n) = [h_lk(n), h_lk(n − 1), . . . , h_lk(n − M + 1)]^T, where (.)^T denotes the vector transpose and the subscript lk denotes the link between nodes l and k. The received signal is formed from the channel output, that is, Σ_{i=0}^{M−1} h_lk(i) x(n − i), and is corrupted by additive noise. The corresponding input regression vector is x_lk(n) = [x_lk(n), x_lk(n − 1), . . . , x_lk(n − N + 1)]^T. In the training phase, the transmitted signal sequence x(n) is known to the receiver and is convolved with the estimated channel response of the adaptive filter to produce the output y_lk(n) = Σ_{i=0}^{M−1} ĥ_lk(i) x(n − i).
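As a concrete illustration of this signal model, the sketch below (all numeric values, including the channel taps, are hypothetical choices and not taken from the paper) passes BPSK symbols through an M-tap FIR channel and adds zero-mean, unit-variance Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(0)

M = 4                                    # channel memory (FIR order)
n_sym = 16                               # number of training symbols

# BPSK training symbols drawn from {-1, +1}
x = 2 * rng.integers(0, 2, n_sym) - 1

# Hypothetical channel impulse response h_lk for one link
h_lk = np.array([0.9, 0.4, -0.2, 0.1])

# Channel output: convolution of the input with h_lk, truncated to n_sym
noiseless = np.convolve(x, h_lk)[:n_sym]

# Desired signal: channel output corrupted by zero-mean, unit-variance noise
d = noiseless + rng.normal(0.0, 1.0, n_sym)
```

The receiver only observes d and the known training sequence x; the estimation task is to recover h_lk from these.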
The estimator relies on an FIR filter for the estimation of the channel gains at a particular node, together with an adaptation and step-size adjustment mechanism for progressively adjusting the FIR filter weights to ensure stability and convergence. Ideally, the filter output matches the desired signal, that is, y_lk(n) = d_lk(n). The estimation error is the difference between the channel and filter outputs, e(n) = d(n) − d̂(n), and the optimization objective is to keep the error e(n) as small as possible. The observations are acquired in the presence of noise, modelled as an additive Gaussian random variable with zero mean and unit variance. An FIR filter is used to estimate a desired signal d(n), with u(n) = [u_0, u_1, u_2, . . . , u_{M−1}]^T as the input vector, where n is the sample index. It is assumed that u(n) and d(n) are obtained from infinite-length random processes. The symbol y(n) = u^T(n) h(n) = h^T(n) u(n) represents the output of the FIR filter, and e(n) = d(n) − u^T(n) h(n) is the estimation error. The output of the filter is adjusted iteratively so that it matches the desired signal d(n), and the error between the desired response and the filter output is progressively minimized. The Mean Square Error (MSE) is one of the criteria used for the design of adaptive filters and is expressed as:
ξ(n) = E[e^2(n)]   (1)
where the operator E[.] represents the statistical expected value. The MSE criterion is well suited for performance comparison because it is mathematically tractable, physically related to energy, and has a single global minimum point resulting in optimal values of the filter coefficients [46]. In particular, in the absence of noise, it produces an unbiased estimate. This is characteristic of non-recursive FIR filters, which have a smooth performance surface with continuous derivatives. The performance function ξ(n) is, therefore, a bowl-shaped hyper-paraboloid with one minimum point.
This point represents the optimal filter coefficients, that is, the true or desired weights that an adaptive filter must reach to achieve optimal performance in the given application. In (1), applying the expectation operator to the squared error generates the autocorrelation matrix and the cross-correlation vector. However, the auto- and cross-correlation coefficients need to be known beforehand, so the method is not practically applicable. Moreover, in the Wiener filter case, the calculation of the optimal weights requires computationally complex matrix multiplication and inversion. An iterative algorithm, on the other hand, is more appropriate for practical use and suits the framework of an adaptive system in a changing environment [46]. The basic philosophy in such cases is to minimize the MSE at each time instant n by applying a correction term Δh(n) to the filter weights h(n) to form a new set of coefficients at time n + 1, that is:
h(n + 1) = h(n) + Δh(n)   (2)
The design objective of adaptive filters is to determine the correction term such that the filter weights converge to the desired response as early as possible and keep adjusting the weights in the steady state of operation. The most common techniques using first-order derivative based gradient search are the Least Mean Squares (LMS) and Recursive Least Squares (RLS) algorithms. The first is based on the instantaneous error, which makes it attractive for real-time applications, but it has relatively slow convergence. The second recursively approximates the autocorrelation matrix, leading to a faster-converging algorithm [34,40], but it is computationally expensive. The LMS algorithm approximates the stochastic-gradient steepest descent technique. It updates the tap weights iteratively in the direction of the negative gradient of the squared amplitude of the instantaneous error signal [41], converging to the Wiener or optimal solution.
The convergence is controlled by a step-size parameter μ ≤ 1; large values result in faster convergence but degraded steady-state behavior, while smaller values result in slower convergence but better steady-state performance. The suitability of the LMS algorithm for real-world applications is owed to its low computational cost: it does not require complex operations such as squaring, averaging, differentiation, matrix inversion, or measurement of correlation functions [42,46]. For an N-tap adaptive FIR filter, the correction term for the LMS algorithm can be expressed as:
Δh(n) = −μ ∇J_h(n) = μ e(n) u(n)   (3)
Incorporating (3) in (2), we obtain a generalized update equation for the channel estimates at each iteration of the LMS algorithm:
h(n + 1) = h(n) + μ e(n) u(n)   (4)
As already stated, if μ is too small, the filter takes too long to converge, and if μ is too large, the filter may diverge in the steady state. The complexity of the LMS algorithm is O(N) [46], which makes it a good choice for online computations in adaptive filtering.
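A minimal sketch of this update, with hypothetical step size and tap values, could look as follows (the loop body is exactly the correction of Eq. (4), h ← h + μ e u):

```python
import numpy as np

def lms_channel_estimate(x, d, M, mu=0.05):
    """Estimate an M-tap channel with the LMS update h <- h + mu*e*u."""
    h_hat = np.zeros(M)                        # initial channel estimate
    u = np.zeros(M)                            # input regression vector
    for n in range(len(x)):
        u = np.concatenate(([x[n]], u[:-1]))   # shift newest sample in
        y = h_hat @ u                          # adaptive filter output
        e = d[n] - y                           # instantaneous error
        h_hat = h_hat + mu * e * u             # LMS correction term
    return h_hat
```

With enough training symbols and low noise, h_hat approaches the true channel taps; the choice mu=0.05 is only an illustrative value within the stability range.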

Fractional Diffusion LMS Algorithm for Distributed Channel Gains Estimation
Consider the scenario of N active sensors, each interested in estimating the unknown channel response with M taps, that is, ĥ_k(n) = [ĥ_{k,0}, ĥ_{k,1}, ĥ_{k,2}, . . . , ĥ_{k,M−1}]^T. At sample index n, each sensor k independently probes h and observes its response r_k(n) to the input u_k(n) = [u_{k,0}, u_{k,1}, u_{k,2}, . . . , u_{k,M−1}]^T in the presence of additive noise δ_k(n), modelled as zero-mean Gaussian. The desired response for each sensor k is written as [11,12]:
r_k(n) = h_k^T(n) u_k(n) + δ_k(n)   (5)
where h_k(n) represents the exact or optimal weight vector of the channel. The network objective is to choose the estimated channel vector ĥ_k(n) so as to minimize the cost function:
J(n) = Σ_{k=1}^{N} E[(r_k(n) − ĥ_k^T(n) u_k(n))^2]   (6)
Using the random sleep strategy [12,23] for channel estimation, now consider the first sleep cycle. Suppose that node i is awake at sleep-cycle instant k and transmits its training sequence of length M, denoted by u_{k,j}(n). Each node j receives the output r_{k,j,i,m} from the channel having gain h_{i,j}, where m ∈ {1, 2, . . . , M}. The network nodes j ∈ S(k)\{i} can use the diffusion LMS algorithm [7]:
ĥ(m) = ĥ(m − 1) + μ e_{k,j,i,m} u_{k,i,m}   (8)
where μ is the step size and e_{k,j,i,m} is the error, equal to the difference between r_{k,j,i,m} and the estimator output ĥ^T u. At the end of each sleep-cycle instant k, the nodes that are awake diffuse their estimates to get the combined estimate h̄_k as:
h̄_k = Σ_{i∈S(k)} α(k, i) h̄_k^i   (9)
where h̄_k^i is the estimate of h at node i at the end of sleep-cycle instant k and the α(k, i) satisfy Σ_{i∈S(k)} α(k, i) = 1. The nodes i ∈ S(k) use the combined estimate h̄_k for estimation during the later sleep-cycle instants. The coefficients α(k, i) are free parameters chosen by the designer, and their careful selection influences the performance of the algorithm. To further refine the estimation of the channel gains, a fractional-order modified diffusion LMS algorithm (FrDLMS) is proposed. The FrDLMS adds a fractional derivative based term to the standard LMS.
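The combination step of Eq. (9) under the random sleep strategy can be sketched as follows; the network size, the uniform choice of the weights α(k, i), and the per-node estimates are all hypothetical illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, d_awake = 20, 4, 4                 # nodes, channel taps, awake nodes

# Per-node channel estimates after local adaptation (hypothetical values)
h_nodes = rng.normal(size=(N, M))

# Random sleep strategy: only d_awake nodes are active this cycle, S(k)
awake = rng.choice(N, size=d_awake, replace=False)

# Uniform combination weights alpha(k, i), summing to one over S(k)
alpha = np.full(d_awake, 1.0 / d_awake)

# Diffusion step of Eq. (9): awake nodes fuse their estimates
h_combined = alpha @ h_nodes[awake]

# Awake nodes adopt the fused estimate for the later sleep-cycle instants
h_nodes[awake] = h_combined
```

Non-uniform weights (e.g., Metropolis rules) are equally valid as long as they sum to one over the awake set.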
Having the estimated channel gain ĥ_k and the error e_{k,j,i,m}, the cost function that a given node has to minimize is the mean square error:
J_i(n) = E[e_i^2(n)]   (10)
The network objective is to minimize the cost function, that is, each node chooses the channel estimate ĥ such that min_ĥ Σ_{i=1}^{N} J_i(n) = Σ_{i=1}^{N} E[e_i^2(n)]. For the minimization, taking the partial derivative of the squared error in (10) with respect to ĥ at a given cycle gives:
∂J(n)/∂ĥ = −2 e(n) u(n)   (11)
which results in the second part of Eq. (3). To introduce the fractional part [47–51], we use the Caputo fractional derivative of order ν, with n − 1 < ν < n, of the function g(t) = t^p, p ≥ 0, as given in [14,46–51]:
D^ν t^p = (Γ(p + 1)/Γ(p + 1 − ν)) t^{p−ν}   (12)
For p ≤ n − 1, the fractional derivative of the polynomial g(t) = t^p is zero. In (12), the operator D^ν denotes the fractional derivative and Γ denotes the gamma function, calculated with the integral Γ(t) = ∫_0^∞ x^{t−1} e^{−x} dx. Applying the Caputo derivative of Eq. (12) in (11), and simplifying, we obtain the fractional partial derivative:
D^ν J(n) = −2 e(n) u(n) ⊙ ĥ^{1−ν}(n)/Γ(2 − ν)   (14)
The symbol ⊙ represents element-wise multiplication between the two vectors u(n) and ĥ^{1−ν}. The update expression utilizes both integer- and fractional-order derivatives together; the algorithm relies on two step sizes, μ_1 and μ_f, for learning-rate control, and on the fractional order ν, which lies between 0 and 1. For 0 < ν < 1 and p = 1, the Γ function always operates on R+. It is worth noting that a fractional or real power of a negative number results in a complex number. To keep all the weights real, we use the identity h^{1−ν}(n) = |h|^{1−ν}(n) sgn(h(n)) to avoid the generation of complex numbers [18], where sgn is the sign function. We modify the update Eq. (4) with the correction term Δh(n) equal to the sum of the integer-order term as in (11) and the fractional part as in (14):
ĥ(n + 1) = ĥ(n) + μ_1 e(n) u(n) + μ_f e(n) u(n) ⊙ ĥ^{1−ν}(n)/Γ(2 − ν)   (15)
where μ_f is the step size for the fractional part in the adaptation Eq. (15).
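A sketch of one weight update in the form of Eq. (15), including the |h|^{1−ν} sgn(h) trick that keeps the fractional power real-valued, could look like this (step sizes and ν are hypothetical values):

```python
import numpy as np
from math import gamma

def frlms_update(h_hat, u, d, mu1=0.05, mu_f=0.05, nu=0.9):
    """One fractional LMS weight update combining integer- and
    fractional-order gradient terms, as a sketch of Eq. (15)."""
    e = d - h_hat @ u                            # instantaneous error
    # |h|^(1-nu) * sgn(h) keeps the fractional power real-valued
    frac = np.abs(h_hat) ** (1.0 - nu) * np.sign(h_hat)
    # integer-order LMS term plus fractional-order correction
    return h_hat + mu1 * e * u + (mu_f / gamma(2.0 - nu)) * e * u * frac
```

When the weights are zero, the fractional term vanishes and the update reduces to the plain LMS correction; as the weights grow, the fractional term scales each tap's correction by its own magnitude.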
This is the final equation used to estimate the channel gains in the distributed environment, and the weights so obtained are used in the fusion Eq. (9). Fig. 3 shows the effect of varying a weight w with different fractional powers while changing the weight from 0 to 1. The nonlinear behavior can be seen: the left-side curves show the effects of different fractional orders on the combined factor F1 = w^{2−ν}/Γ(3 − ν), which considers both the integer and fractional derivatives in the optimization, while the right side shows the effect of the fractional order on the fractional part alone, that is, F2 = w^{1−ν}/Γ(2 − ν). As opposed to standard linear techniques, this approach helps adjust each weight according to its importance, as required by partial-update functions [46]. We consider this implementation for the two configurations of the diffusion strategies, combine-then-adapt (CTA) and adapt-then-combine (ATC). These algorithms utilize an accumulation step [12,23] in the adaptation mechanism to fuse information obtained from the local neighborhoods. At every instant i, each agent in the CTA strategy performs two steps: a combination step, in which agent k aggregates the channel gain estimates from its neighbors to obtain an intermediate estimate from the previous step i − 1, and an adaptation step, in which agent k uses its own data to update this intermediate estimate of the channel gains. The intermediate state allows information to diffuse through the network. ATC is an alternative form of the CTA diffusion strategy, obtained by swapping the order of the combination and adaptation steps. Tab. 1 summarizes the fractional-order diffusion algorithm implementing the ATC strategy, and Tab. 2 shows the implementation of the fractional-order CTA based approach. Both algorithms rely on the same input and controlling parameters; only the adaptation parts of the algorithms differ.
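The two orderings can be sketched at network level as follows; the plain LMS adaptation, the row-stochastic combination matrix A, and the step size are illustrative simplifications (the paper's Tabs. 1 and 2 use the fractional update instead):

```python
import numpy as np

def adapt_all(H, U, D, mu=0.05):
    """LMS adaptation at every node; rows of H are per-node estimates."""
    E = D - np.sum(H * U, axis=1)                  # per-node errors
    return H + mu * E[:, None] * U

def combine_all(H, A):
    """Combination step: row-stochastic matrix A mixes neighbor estimates."""
    return A @ H

def cta_step(H, A, U, D, mu=0.05):
    return adapt_all(combine_all(H, A), U, D, mu)  # combine-then-adapt

def atc_step(H, A, U, D, mu=0.05):
    return combine_all(adapt_all(H, U, D, mu), A)  # adapt-then-combine
```

The only difference between the two functions is the order of the same two steps, mirroring how ATC is obtained from CTA by swapping combination and adaptation.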
The computational complexity of the FrDLMS is slightly higher than that of the DLMS. (The pseudocode listings of Tabs. 1 and 2, which repeat the error calculation from the estimates for all i ∈ S(k) and the diffusion of the estimates to obtain h̄_k as in (9), are not reproduced here.)

Convergence Analysis
To obtain the convergence properties of the fractional LMS algorithm in (15), we move to a different frame of reference. We define an error vector v(n) at instant n as:
v(n) = ĥ(n) − h_o   (16)
where h_o represents the optimum channel gain vector. The error due to these optimum gains is:
e_o(n) = d(n) − u^T(n) h_o   (17)
In terms of v(n), the relationship in Eq. (15) can be written for the gains at index n + 1 as:
v(n + 1) = v(n) + μ_1 e(n) u(n) + μ_f e(n) u(n) ⊙ ĥ^{1−ν}(n)/Γ(2 − ν)   (18)
For simplicity, we set μ_f = Γ(2 − ν) μ_1. Using the value of e_o(n) and taking the expectation on both sides to apply the principle of orthogonality, one arrives at the following relation:
E[v(n + 1)] = E[v(n)] − μ_1 R E[v(n)] − μ_1 R E[|v(n)|^{1−ν}] E[v(n)]   (19)
which, in simplified form, can be written as:
E[v(n + 1)] = (I − μ_1 R − μ_1 R E[|v(n)|^{1−ν}]) E[v(n)]   (20)
Representing the product R E[|v(n)|^{1−ν}] by F(ĥ(n), ν), the relation can be written as:
E[v(n + 1)] = (I − μ_1 (R + F(ĥ(n), ν))) E[v(n)]   (21)
The mean power of the weight difference should be a decreasing function of the number of iterations, and at steady state every mode of (21) must be contracting, which is only possible for:
−1 < 1 − μ_1 λ(R + F(ĥ(n), ν)) < 1   (22)
This results in the simplified form:
0 < μ_1 < 2/λ(R + F(ĥ(n), ν))   (23)
In terms of the maximum eigenvalue λ_max, it can be written as:
0 < μ_1 < 2/λ_max(R + F(ĥ(n), ν))   (24)
This condition governs the selection of the step-size parameter to ensure convergence of the FLMS algorithm. It can be seen from (23) and (24) that the addition of the nonlinear term resulting from the fractional derivative adjusts the autocorrelation matrix in such a way that the spread of its eigenvalues decreases; hence, the convergence as well as the steady-state response improves, even for larger step sizes.
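A step-size bound of the form in (24) can be evaluated numerically from the input autocorrelation matrix; the correlated matrix below is a hypothetical example (for i.i.d. unit-variance BPSK input, R is simply the identity):

```python
import numpy as np

# Hypothetical correlated input autocorrelation matrix R
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])

# Largest eigenvalue of the symmetric matrix R
lam_max = np.max(np.linalg.eigvalsh(R))

# Mean-convergence bound in the style of Eq. (24): 0 < mu < 2 / lambda_max
mu_bound = 2.0 / lam_max
```

For this R the eigenvalues are 0.5 and 1.5, so any step size below 2/1.5 ≈ 1.33 satisfies the bound; the fractional term F(ĥ(n), ν) would be added to R before taking the eigenvalue.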

Simulation Results
A logical diagram of the proposed algorithm and its application to the estimation of channel gains in a distributed environment is depicted in Fig. 4. The simulation procedure starts with the initialization of parameters related to the wireless sensor network, such as the number of sensor nodes N, the length of the training sequence M, the number of time instants k in the sleep cycle, and the number of nodes d awake at a time. Other controlling parameters include the step sizes μ_1 and μ_f and the fractional order ν, together with the corresponding value of the gamma function. The number of times a node is awake is kd/N, while the number of edges is kd.
First of all, the prior statistics of mean and covariance, based on 20 thousand experiments, are generated for the channel gains using only the path loss effect with a path loss exponent of 4 [12]. This is followed by the Monte Carlo loop. At the start, the shadowing effect, based on a normal distribution, is added. After this, a random sleep strategy (RSS) is applied. The nearest neighbors (first and second) are found, and links are established after the first, second and third sleep cycles. For each time event, a random sequence u is drawn from an independent and identically distributed Gaussian process and modulated as binary phase shift keying symbols. The symbols are input to the channel, and the channel output at a node serves as the desired reference for the error of the diffusion algorithm. The adaptation algorithm based on the proposed methodology is applied to obtain the channel estimates and is followed by the diffusion process. The estimates are obtained after the diffusion, the process repeats, and the average error is calculated.
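The channel-gain prior described above (path loss with exponent 4 plus log-normal shadowing over randomly placed sensors) can be sketched as follows; the 8-dB shadowing standard deviation is an assumed illustrative value, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
N, path_loss_exp = 20, 4

# Random sensor positions: i.i.d. zero-mean, unit-variance Gaussian
pos = rng.normal(size=(N, 2))

# Pairwise inter-node distances
diff = pos[:, None, :] - pos[None, :, :]
dist = np.linalg.norm(diff, axis=2)
np.fill_diagonal(dist, np.inf)            # exclude self-links

# Path loss with exponent 4, plus log-normal shadowing (8 dB std, assumed)
path_loss_db = -10 * path_loss_exp * np.log10(dist)
shadow_db = rng.normal(0.0, 8.0, size=(N, N))
gain_db = path_loss_db + shadow_db
```

Averaging such gain matrices over many independent draws would yield the prior mean and covariance used to initialize the Monte Carlo loop.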
Of the two configurations of the diffusion strategies, we consider this implementation for the CTA, as it is the most widely used one. Simulation results are presented for both the standard DLMS [12] and the FrDLMS algorithms, with plots shown for the estimation errors. The performance metric for comparison is the average squared error. We estimate the channel gains vector h while considering a WSN with 20 nodes, employing the random sleep strategy with K = 10 and d = 4. The sensor positions are generated i.i.d. from a Gaussian distribution with zero mean and unit variance. The true channel gains to be estimated are generated with path loss and log-normal shadowing effects [33–36]. Five hundred binary phase shift keying symbols are used for training, unit noise variance is considered for the experiments, and the averaged mean squared error is obtained using 50 independent Monte Carlo runs. The performance metric of average MSE against the number of sleep cycles is used for comparison; three different cases with l = 1, 2, 3 are considered. In the simulations, the step-size parameter is kept at 1.0 for the fastest convergence possible for the DLMS algorithm [12]. In this simulation, the fractional order is fixed at 0.9. It can be seen that the FrDLMS performs better than the DLMS, helping the WSN nodes estimate the channel gains in a small number of sleep cycles. The proposed FrDLMS algorithm has much superior performance, with almost an 8-dB gain over the DLMS algorithm. Fig. 6 shows the average squared error performance vs. the number of sleep cycles (20) for the DLMS and FrDLMS algorithms for all three cases, that is, 'Case 1,' 'Case 2' and 'Case 3,' with the step size kept at 1.0 and the fractional order set to 0.3 to see the effect of the fractional order on the convergence.
Again, it can be seen clearly that the FrDLMS algorithm offers better performance than the standard DLMS algorithm under network conditions in which small numbers of sleep cycles are used for the estimation of channel gains. The proposed FrDLMS has much superior performance, with almost a 6-dB gain over the DLMS algorithm for all the cases. One can also observe that the convergence improves with increasing fractional order.

Conclusions
We proposed a novel fractional-order modified diffusion least mean square algorithm and applied it to the estimation of channel gains in a wireless sensor network. We considered different cases of the random sleep strategy and, using the average squared error as the performance metric, found that the proposed algorithm has much superior performance compared with its conventional counterpart. The results were generated for different fractional orders, and it was found that the gain is of the order of 6–8 dB when 0-dB additive noise is considered at each node. The convergence speed increases as the fractional order increases. We believe that the fractional-order DLMS is a useful addition to fractional-order signal processing; it is open to further research and can be applied in other applications such as distributed sensing and estimation, online machine learning, intrusion detection and target localization.
Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.