Cross-Layer Design for EH Systems with Finite Buffer Constraints

Mohammed Baljon; Shailendra Mishra

doi:10.32604/cmc.2021.017509


Computers, Materials & Continua DOI:10.32604/cmc.2021.017509
Article


Mohammed Baljon and Shailendra Mishra*

Department of Computer Engineering, College of Computer and Information Sciences, Majmaah University, Majmaah, 11952, Saudi Arabia
*Corresponding Author: Shailendra Mishra. Email: s.mishra@mu.edu.sa
Received: 01 February 2021; Accepted: 11 March 2021

Abstract: Energy harvesting (EH) in wireless communication is a promising approach to extending the lifetime of future wireless networks. This paper investigates a cross-layer optimal adaptation policy, based on a semi-Markov decision process (SMDP), for a point-to-point EH wireless communication system with finite buffer constraints over a Rayleigh fading channel. Most adaptation strategies in the literature depend on the channel alone. The proposed dynamic modulation policy additionally takes the states of the energy capacitor and the data buffer into account. Unlike the channel-dependent policy, which is a physical-layer optimization, the proposed cross-layer dynamic modulation policy is designed to meet the overflow requirements of the upper layer by maximizing the throughput while optimizing the transmission power and minimizing the number of dropped packets. Based on the states of the channel, data buffer, and energy capacitor, the scheduler selects an action corresponding to a modulation constellation, and the packets are modulated into symbols accordingly for transmission over the Rayleigh fading channel. Simulations show that the proposed cross-layer policy significantly outperforms the physical-layer channel-dependent policy in terms of throughput.

Keywords: Energy harvesting technology; cross-layer design; delay tolerant network; fading channels; resource allocation; telecommunication power management; telecommunication scheduling

1 Introduction

Energy conservation has recently attracted growing attention as a way to reduce the world's energy consumption, driven by the soaring demand for and explosive growth of wireless communications [1]. Green communication is the main alternative for addressing many of the problems related to energy wasted in wireless transmissions [2]. It can be defined as the practice of effectively utilizing the energy harvested from the surrounding environment by selecting energy-efficient communication technologies. Conserving ambient energy and using the available energy judiciously improves overall network throughput [3]. By 2050, the number of wireless communication devices, e.g., wearable devices and wireless sensor networks, will double or triple due to the emerging Internet-of-Things (IoT) technology [4].

As a result, considerable research interest has been generated in the last decade in exploring efficient and economical methods for allocating energy resources. Green communication can also reduce the emission of carbon dioxide (CO2) and the threat posed by the enormous energy consumption of wireless networks. Therefore, many countries and organizations have agreed to reduce energy consumption [5,6]. In addition to saving energy and reducing CO2 emissions, green communication can maximize the lifetime of wireless communication tasks because its energy source is renewable. A traditional communication system cannot operate beyond the limits set by its battery size or power supply. EH radio nodes, on the other hand, can harvest energy from renewable sources in their environment and convert it into electrical energy to power their functions. As a result, green communication with EH capability is an effective solution to the limited network link lifetime discussed in [7,8].

Despite all the above properties of green communication represented in EH wireless networks, certain difficulties should be investigated and perhaps a new design dimension should be added. The main challenge of EH technology is the time-varying energy harvesting [9] and the scarcity of energy amount [10], which lead to the conclusion that the communication performance guarantee is difficult to fulfill. Therefore, considerable efforts have been made to improve the performance of EH wireless communication [11,12]. It is highlighted that adjusting the randomness and low rate of energy arrivals is quite crucial to develop efficient transmission policies and schemes for EH wireless networks. Due to the time-varying energy arrivals in EH technology, the transmission power needs to be adjusted even if the wireless fading channel remains unchanged, which is an additional challenge and unique feature of EH wireless networks [13].

In contrast, due to the additional metric of data buffering characteristics, the buffering delay must be considered in the queue, and resource allocation algorithms are proposed in [14,15]. Moreover, different types of delay constraints, including delay-tolerant and non-delay-tolerant views, need to be explored along with guaranteeing QoS on delay properties while proposing resource allocation schemes. A non-delay tolerant approach can be classified as a real-time application, such as real-time streaming, online gaming, and intelligent and smart assisted systems [16], which can be considered as a hard delay constraint. An example of delay-tolerant applications is traditional Internet services such as file transfer, email exchange, and web browsing, which can generally tolerate some delays in certain areas. However, a modern power-constrained wireless communication system is constrained by wireless time-varying fading channels as well as random arrival rate of traffic, which can lead to greater difficulties in ensuring the required QoS characteristics for real-time applications. Also, further limitations arise for wireless nodes that use energy harvesting technology and can therefore be referred to as Energy Harvesting Nodes (EHNs). Although EHNs are suitable for remote operation in monitored areas without human intervention, the random nature of energy harvesting technology introduces a new paradigm in resource allocation, including power allocation and scheduling. Therefore, a cross-layer dynamic modulation policy is a guarantee to meet the overflow requirements of the upper layer by maximizing throughput while optimizing transmission power and minimizing packet loss.

In this paper, we investigate the cross-layer dynamic modulation policy for energy harvesting (EH) communication system by dynamically adapting the variable power and variable rate with finite buffer constraints, including states for each channel condition, data buffers, as well as energy capacity, to guarantee that the network throughput is maximized while minimizing both the energy consumption and the number of dropped packets. Due to the natural instability of wireless time-varying fading channels and the arrival rates of data and energy, the transmission power and rate generally depend on the time-varying channel condition, the data buffer condition, and the energy capacity.

In general, both the data buffer and the energy capacity are limited by finite memory in practice. Consequently, in addition to optimizing the channel adaptive strategy, the buffers in the system must also be considered. Moreover, statistical optimization techniques cannot lead to the determination of an exact scheduling strategy due to the overlap and consideration of several elements, such as varying channel gains, the randomness of data arrivals, and the randomness of energy arrivals. Moreover, since the packet scheduling formulation is inherently dynamic, the formulation is classified according to the criterion of stochastic dynamic programming, i.e., dynamic optimization. The Markov decision process (MDP) is one of the formulas that use the criterion of dynamic optimization, a mathematical framework that analyzes system dynamics in uncertain environments. Since the decisions made using the MDP approach follow time-based characteristics, the MDP approach is not suitable for the decision epochs that have random characteristics in terms of energy and data arrival, resulting in different durations of the decision epochs.

A wireless communication system with EH capability is therefore event-based in nature, and the semi-Markov decision process (SMDP) is the more appropriate approach for modeling such a system with finite buffer constraints over a wireless fading channel. In this paper, the proposed system model is formulated as an SMDP to increase the throughput of the network while allocating less energy and minimizing packet dropping. To the best of our knowledge, no recent work in the open literature has studied the throughput maximization and resource allocation problem of a point-to-point EH wireless communication system with finite buffer constraints over a Rayleigh fading wireless channel as an infinite-horizon SMDP-based problem under data buffer and uncertainty constraints for wireless fading channels.

The main contributions of this paper are summarized as follows:

—Formulation of a novel framework for a point-to-point EH wireless communication system with finite buffer constraints on the source node over a fading channel based on an SMDP approach to maximize the network throughput by optimally allocating the harvested energy while maintaining minimum packet overflow.

—A dynamic programming technique based on SMDP is proposed to dynamically adapt the change of channel and/or buffer states, which results in optimally satisfying the physical layer requirements BER on the one hand and the data link layer overflow requirements on the other hand.

This paper is organized as follows. Section 1 introduces energy harvesting (EH), the semi-Markov decision process (SMDP), and the purpose and importance of the work. Section 2 discusses related work on EH wireless communication systems and SMDP-based formulations. Section 3 describes the system model. The SMDP-based formulation is presented in Section 4. Section 5 discusses the adaptation policy of the cross-layer design. Results and analysis are discussed in Section 6, and the paper is concluded in Section 7.

2 Related Work

In [17], the authors proposed a resource allocation framework for a point-to-point EH wireless communication system based on the SMDP approach that maximizes the network throughput by considering only channel adaptation. Since the transmission scheduling is only channel-based, the proposed scheme provided the benchmark for the maximum performance of the physical layer under the assumption that both the data buffer and the energy buffer are infinite and the data buffer is full with stored data to be transmitted. For practical wireless networks, the adaptation of packet transmission to channel conditions along with consideration of buffer state is critical. The goal of adaptation is to stabilize system performance by providing maximum throughput while reducing the drop probability and minimizing buffer delay. The design of a wireless communication system with EH capability has generated many research activities in the field of modern wireless technology. The throughput maximization problem for a point-to-point EH wireless communication system over a fading channel was considered, while the authors in [18] attempted the same system model by proposing a low-complexity and optimal transmission policy called recursive geometric water filling (RGWF). Two-hop wireless cooperative transmission with EH capable nodes have been well studied recently.

In [19], an optimal transmission policy for the two-hop wireless communication system with EH capability at the relay node was proposed. The throughput maximization problem for a two-hop wireless communication system with EH capability at the source node was studied in [20] and solved with a cumulative curve algorithm. In [17], the RGWF algorithm was used to maximize the throughput of the two-hop EH system. Moreover, in [21], the authors considered ultra-dense small cell networks with EH capability on the base stations, where the resource allocation problem is studied and the joint user allocation and optimal power allocation are modeled based on mixed-integer programming. Moreover, in [22], the authors have tried to solve the problem of minimizing the outage probability of a network with mesh topology with sources’ EH capabilities.

On the other hand, numerous system models have been formulated based on the SMDP approach, such as mobile cloud computing networks, vehicular cloud computing networks, wireless networks, and cognitive vehicular networks. The authors in [23] showed how to manage the cloud resources, i.e., virtual machines, to support continuous cloud service across multiple cloud domains based on SMDP. In [24], the authors proposed a framework for shared multi-resource allocation for the same proposed system model in [23] using SMDP. The main objective of the proposed framework is to achieve an optimal multi-resource allocation decision by maximizing the total rewards while reducing the probability of service rejection and the time of service operation. In [25], the authors propose an optimized resource allocation scheme to optimize the long-term potential reward of the SMDP-based vehicular cloud computing system. The long-term expected reward of the system is derived by considering both the return and cost of the proposed system model and the changing characteristics of the resources. From the perspective of cognitive vehicle networks, the authors in [26] captured the dynamic property of vehicle user mobility and the change in availability in the cognitive band, where the shared resource allocation framework is formulated using the SMDP approach.

In [27], the authors considered a Narrowband Internet of Things (NB-IoT) edge computing system where Mobile Edge Computing (MEC) servers were deployed at NB-IoT enabled BSs. As a result, the IoT sensors can send their sensed data to the MEC servers in a single hop and utilize maximum computing and storage capacities. In general, the normal MDP model requires additional overhead because more information about the system states is needed to store information about previous system association actions. Also, scheduling and offloading decisions need to be made at each time slot. Therefore, the Continuous-Time Markov Decision Process (CTMDP) model was used to formulate the NB-IoT system in [27] to reduce both the total power consumption of the IoT sensors and the long-term average system delay. Similarly, in [28], the authors used a CTMDP-based scheme to formulate the vehicle cloud resource allocation problem for mobile video services. In particular, the authors investigated dynamic offloading, which they claimed has a great impact on expanding the number of shareable resources, in addition to reducing the cost of communication paths. Therefore, the goal of the model was to improve the use of the iterative algorithms imposed in the SMDP scheme. Also, the authors in [29] used an SMDP-based scheme to propose a service function allocation algorithm for mobile edge cloud networks.

The problem was defined by considering a system reward and cost. The value iteration algorithm was used to obtain the maximum reward and reduce the rate of rejected requests. Also, many efforts have been made to utilize the promising technology Software-Defined Network (SDN) in IoT applications. The authors in [30] used SMDP to formulate the radio resource allocation problem to maximize the expected average reward of the proposed SDN-based IoT networks. The optimal solution was obtained by a relative value iteration algorithm in SMDP, while simulation results showed that the proposed resource allocation scheme successfully improved the long-term average system rewards compared to other similar resource allocation schemes in the literature. Moreover, an optimal power allocation for wireless sensors powered by a dedicated radio frequency energy source was formulated using the SMDP scheme for both time division multiplexing and frequency division multiplexing [31]. Simulation results showed that the proposed scheme outperformed the heuristic greedy method in the literature.

3 System Model

We consider a point-to-point EH wireless communication system over fading channels with a single EH transmitter and a single receiver. The transmitter is equipped with a finite energy capacitor of size $K_{\max}$ and a finite data buffer of size $D_{\max}$, as shown in Fig. 1a. We assume that the point-to-point transmission is organized into radio frames, each of which is divided into multiple time-slots.


Figure 1: (a) A point-to-point EH wireless communication system with EH capability in addition to finite data and energy buffer on the source node, (b) SMDP representation of the proposed problem

Let $\lambda_c$ denote the average packet arrival rate at the transmitter data buffer, where arrivals are assumed to follow a Poisson distribution, and let $\lambda_e$ denote the average EH arrival rate at the transmitter energy capacitor. The protocol data units (PDUs) of the higher layer are packets, each consisting of a group of information bits, and they accumulate in the finite-size transmitter data buffer. In contrast, the PDUs of the physical layer are blocks, each made up of a group of symbols. According to the states of the channel, data buffer, and energy capacitor, the scheduler chooses an action $u\in U$, which corresponds to the selected modulation constellation. Based on the chosen modulation type, packets are modulated into symbols for transmission over the Rayleigh fading channel. At the receiver, the received symbols are demodulated into a stream of bits, which are accumulated into packets and stored in the receiver data buffer. Finally, the received packets are delivered to the application layer through the network stack.

We assume that time is divided into discrete time-slots, each represented by a frame of $N_s$ channel uses (symbols), as shown in Fig. 1b. Depending on the scheduler's decision, the number of transmitted packets may vary from frame to frame.

Let $w_n$ be the number of packets extracted from the data buffer for transmission and $R_n$ the adaptive modulation rate of that transmission in bits/symbol. The number of transmitted packets and the modulation rate are related by

$$w_n=\left(\frac{N_s}{N_p}\right)R_n, \tag{1}$$

where $N_p$ is the packet size in bits.

3.1 Channel Modeling

We consider a Rayleigh channel with ergodic flat fading in the analyzed EH system. The probability density function (pdf) of the fading power gain of the Rayleigh channel follows an exponential distribution [32]:

$$f(\gamma)=\frac{1}{\bar{\gamma}}\exp\left(-\frac{\gamma}{\bar{\gamma}}\right),\quad \gamma\ge 0, \tag{2}$$

where $\bar{\gamma}$ is the average received power gain of the channel.

The Rayleigh fading channel is modeled as a first-order Markov chain whose states are $C=\{c_1,c_2,\ldots,c_C\}$, where $C$ is the number of non-overlapping channel states. The transition probability matrix is $P=[P_{c_i,c_j}]$, $1\le i,j\le C$, where $P_{c_i,c_j}=P(c_j\mid c_i)$ is the probability of moving from state $c_i$ to state $c_j$. Let $\Gamma=\{\gamma_0,\gamma_1,\ldots,\gamma_C\}$ denote the set of received-SNR thresholds in increasing order, with $\gamma_0=0$, $\gamma_i<\gamma_{i+1}$, and $\gamma_C=\infty$. The channel is in state $c_i$ if $\gamma_{i-1}\le\gamma<\gamma_i$. In this paper, this $C$-state model, with channel state $c\in\{c_1,c_2,\ldots,c_C\}$, describes the wireless channel of the proposed point-to-point EH transmission system.
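The threshold partition described above can be illustrated with a short Python sketch. It assumes the exponential pdf of Eq. (2), so the probability of each channel state is the pdf integrated between adjacent thresholds. The threshold values and the function names are hypothetical and chosen only for illustration.

```python
import math

def state_probabilities(thresholds, mean_gain=1.0):
    """Probability mass of the exponential pdf (Eq. 2) in each interval
    [thresholds[i], thresholds[i+1]), i.e. the chance of each channel state."""
    cdf = [1.0 - math.exp(-t / mean_gain) for t in thresholds]
    return [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

def channel_state(gamma, thresholds):
    """Return the 1-based index i of the state c_i with
    thresholds[i-1] <= gamma < thresholds[i]."""
    for i in range(1, len(thresholds)):
        if thresholds[i - 1] <= gamma < thresholds[i]:
            return i
    raise ValueError("gamma must be non-negative and finite")

# Hypothetical thresholds for a C = 4 state channel (gamma_0 = 0, gamma_C = inf).
THRESHOLDS = [0.0, 0.5, 1.0, 2.0, math.inf]
PROBS = state_probabilities(THRESHOLDS, mean_gain=1.0)
```

Because the last threshold is infinite, the state probabilities cover the whole range of the power gain and sum to one.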

3.2 Energy and Battery Model

The transmitter is assumed to be equipped with a finite energy capacitor that can hold a maximum of $K$ energy units (EUs). Let $K=\{k_0,k_1,\ldots,k_K\}$ denote the capacitor state space in terms of EU occupancy, where $k_j$ corresponds to $j\in\{0,1,\ldots,K\}$ EUs in the capacitor. The number of EUs in the capacitor is determined dynamically by the current capacitor state, the energy consumption, and the newly harvested energy. The dynamics of the capacitor occupancy are given by

$$k_{n+1}=\min\{\max(k_0,\ k_n-o_n+g_n),\ k_K\}, \tag{3}$$

where $g\in\{0,1,\ldots,G\}$ denotes the number of EUs harvested and $o\in\{0,1,\ldots,O\}$ the number of EUs consumed for transmission in each time-slot.

3.3 Queue Dynamics with Finite Buffer Constraint

The transmitter uses its data buffer to store arriving packets. Let $D=\{d_0,d_1,\ldots,d_D\}$ denote the data buffer state space in terms of buffer occupancy, where $d_i$, $i\in\{0,1,\ldots,D\}$, corresponds to $i$ packets stored in the buffer. The number of stored packets at each decision epoch is determined dynamically by the current buffer state, the transmitted packets, and the new incoming traffic:

$$d_{n+1}=\min\{\max(d_0,\ d_n-w_n+f_n),\ d_D\}, \tag{4}$$

where $f\in\{0,1,\ldots,F\}$ is the number of packets received into the data buffer and $w\in\{0,2,\ldots,W\}$ is the number of packets extracted from the data buffer for transmission. The number of packets transmitted over the wireless link is bounded by the number of packets physically present in the data buffer and by the instantaneous link capacity. The data buffer is assumed to be stable, which is expressed by the buffer overflow constraint

$$d_n-w_n+f_n\le d_D. \tag{5}$$

Constraint (5) implies that the data buffer size $d_D$ determines whether the buffer overflow constraint is strict or loose: a small data buffer leads to a strict constraint, while a large data buffer leads to a loose one. Since decisions in an MDP are made at fixed time steps, the MDP approach is not suitable when the decision epochs are driven by random energy and data arrivals and therefore have different durations. A wireless communication system with EH capability is inherently event-based, so the semi-Markov decision process (SMDP) is a more suitable approach for modeling such a system with finite buffer constraints over a wireless fading channel.
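The occupancy updates in Eqs. (3) and (4) share the same clipped form, which the following minimal Python sketch makes concrete. It assumes that $k_0$ and $d_0$ both correspond to an empty capacitor or buffer (zero units); the function names and the example epochs are hypothetical.

```python
def next_occupancy(level, removed, arrived, capacity):
    """Clipped update of Eqs. (3) and (4):
    min{max(0, level - removed + arrived), capacity}."""
    return min(max(0, level - removed + arrived), capacity)

def overflow(level, removed, arrived, capacity):
    """Units that do not fit, i.e. the amount by which constraint (5) is violated."""
    return max(0, level - removed + arrived - capacity)

K_MAX = D_MAX = 20   # capacitor and buffer sizes used in Section 6

# Hypothetical epoch for the data buffer: 18 packets stored, 2 sent, 6 arrive.
d_next = next_occupancy(18, 2, 6, D_MAX)
dropped = overflow(18, 2, 6, D_MAX)

# Hypothetical epoch for the capacitor: 3 EUs stored, 2 consumed, 1 harvested.
k_next = next_occupancy(3, 2, 1, K_MAX)
```

In the first epoch the buffer would need room for 22 packets, so it saturates at 20 and 2 packets are dropped. A larger buffer would absorb the same arrivals without loss, which is the sense in which a large buffer gives a loose constraint.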

4 SMDP Formulation of the Cross-Layer Scheduling

As discussed earlier, the formulation must account for the variability in decision-epoch duration caused by the random arrival of energy at the transmitter's energy capacitor and of data packets at its data buffer. The time between successive control decisions varies because the duration of a decision epoch depends on the current system state and on the action selected in that epoch. The cost of a decision epoch is in turn weighted by the time the system takes to move from one state to the next. Consequently, the problem is formulated as an SMDP, which captures this dynamic nature and supports the required dynamic programming. The objective of our work is to implement a cross-layer scheduler for a point-to-point EH wireless network that optimally adjusts the energy allocation and transmission rate based on the physical layer (channel state) and data link layer (energy capacitor and data buffer states) such that the network throughput is maximized and packet overflow is minimized. The SMDP is defined by the tuple $\{S,A_s,W,T_s,P\}$, corresponding to the system states, actions, system reward, consumption time, and transition probabilities, as explained below.

4.1 System States

To solve the proposed dynamic programming problem, a composite system state space is constructed from the data buffer state space, the energy capacitor state space, and the channel state space. The composite state space is denoted by $S=D\times K\times C=\{s_1,s_2,\ldots,s_S\}$, where $s_m=[d_i,k_j,c_l]$; $m=1,2,\ldots,S$; $i=0,1,2,\ldots,D$; $j=0,1,2,\ldots,K$; and $l=1,2,\ldots,C$.

4.2 Set of Actions

An adaptive power allocation and modulation constellation scheme defines the actions that dynamically adapt the transmission power and rate. It gives a two-to-one mapping from the energy allocation and the transmission rate, on the one hand, to the number of transmitted packets, on the other. Depending on the instantaneous composite system state $s_n$, the controller chooses an action $u_n$, where $U=\{u_1,\ldots,u_U\}$ denotes a finite action space. In general, a policy $\pi$ belonging to the policy space $\Pi$ is a sequence $\pi=\{\mu_1,\mu_2,\ldots\}$, and the action taken at decision epoch $n$ is $u_n=\mu_n(s_n)$. Moreover, given the set of allocable EU amounts $E=\{e_0,e_1,\ldots,e_E\}$ and the set of available transmission rates $W=\{w_0,w_1,\ldots,w_W\}$, two mapping functions can be identified: $\phi:U\to E$ maps an action to the number of allocated EUs, and $\psi:U\to W$ maps an action to the selected transmission rate. Let $P_e(\gamma)$ denote the instantaneous bit error rate (BER) at received SNR $\gamma$. For M-QAM it is given by [33]

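$$P_e(\gamma)=\frac{2}{v}\left(1-\frac{1}{\sqrt{M}}\right)\sum_{i=1}^{\sqrt{M}/2}\operatorname{erfc}\left((2i-1)\sqrt{\frac{3v\gamma P_T}{2(M-1)\bar{P}}}\right), \tag{6}$$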

where $v=\log_2(M)$ is the number of bits modulated into a $2^v$-QAM symbol and $\bar{P}$ denotes the average transmitted signal power. The instantaneous received SNR for a constant transmit power is $\gamma=h\bar{P}/\sigma^2$, where $h$ is the power gain of the channel and $\sigma^2$ is the variance of the channel noise. If the transmission power is $P_T$, the instantaneous received SNR at interval $n$ is $\gamma P_T/\bar{P}$. Two adaptation policies are considered to examine the proposed cross-layer wireless communication system with EH constraints:

4.2.1 Channel-Dependent Static Policy

The adaptive modulation rate is selected based on the channel condition only, while maintaining a fixed, specified BER. However, this adaptation is not implementable in practice because it ignores the finiteness of the data buffer and, consequently, the overflow requirement.

4.2.2 Dynamic Joint Finite-Buffer and Channel-Dependent Policy

An SMDP is formulated for the dynamic joint adaptation to both the finite buffers and the channel state. Because the proposed policy considers both the buffer states and the channel state, the scheduler/controller determines, for each state, the optimum action that maximizes the long-run system reward. The proposed policy satisfies the system requirements by maximizing the system reward while ensuring minimum energy consumption and packet overflow. The combinations of energy allocation and transmission rate form the set $X=E\times W=\{x_0,x_1,\ldots,x_U\}=\{(e_0,w_0),(e_0,w_1),\ldots,(e_E,w_W)\}$.

4.3 Transition Matrix

The probability of a transition from state $s=s_q$ to state $s'=s_r$ under a particular action $u$ is denoted by $P(s'\mid s,u)$. For each action $u=u_i$, the transition matrix can be formed as the Kronecker product of the data buffer, channel, and energy capacitor transition matrices, which are mutually independent:

$$P_s(u_i)=P_d(u_i)\otimes P_c(u_i)\otimes P_k(u_i)=\left[P_{s_q,s_r}(u_i)\right]_{1\le q,r\le S}. \tag{7}$$

The transition probability from state $s=s_q=[d_i,c_l,k_j]$ to state $s'=s_r=[d_x,c_y,k_z]$ under action $u=u_i$ is given by

$$P_{s_q,s_r}(u_i)=P_{d_i,d_x}\,P_{c_l,c_y}\,P_{k_j,k_z}. \tag{8}$$
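The relation between the Kronecker product in Eq. (7) and the entry-wise product in Eq. (8) can be checked numerically. The sketch below uses small hypothetical 2-state matrices for one fixed action; the helper names `kron` and `index` are illustrative and not part of the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

# Hypothetical 2-state transition matrices for one fixed action u_i.
P_D = [[0.6, 0.4], [0.3, 0.7]]   # data buffer
P_C = [[0.8, 0.2], [0.5, 0.5]]   # channel
P_K = [[0.9, 0.1], [0.2, 0.8]]   # energy capacitor

P_S = kron(kron(P_D, P_C), P_K)  # Eq. (7): P_d (x) P_c (x) P_k

def index(i, l, j, n_c=2, n_k=2):
    """0-based row/column of composite state [d_i, c_l, k_j] in P_S."""
    return (i * n_c + l) * n_k + j

# Eq. (8): each entry of P_S is the product of the three component entries.
entry = P_S[index(0, 1, 0)][index(1, 0, 1)]
expected = P_D[0][1] * P_C[1][0] * P_K[0][1]
```

The ordering of the factors in the Kronecker product fixes how composite states are enumerated, which is why `index` nests the buffer, channel, and capacitor indices in that order.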

4.4 Reward Model

The action taken in a state is selected according to its associated cost: the controller chooses the action that yields the maximum reward. A cost function $Q(s_i,u_j)$ relates the state-action pair $(s_i,u_j)$ to the system reward. The system reward $r(s,a)$ (also called the associated cost) for each pair of system state and corresponding action is given by

$$r(s,a)=n(s,a)-g(s,a), \tag{9}$$

where $n(s,a)$ denotes the instant income and $g(s,a)$ the expected cost of the system when action $a(s)$ is taken in state $s$. We describe these objective functions as follows.

4.4.1 Adaptive Modulation Rate

The adaptive modulation rate is the immediate system reward for the state-action pair $(s,a)$. It takes values in the modulation constellation set $Q_E(s,a)\in\{\text{no transmission},\ \text{QPSK},\ \text{16-QAM},\ \text{64-QAM}\}=\{0,2,4,6\}$ bits/symbol, which determines the number of packets taken from the data buffer for transmission.

4.4.2 Buffer Overflow Cost

When the buffer is full, the probability of dropping packets is high. The immediate overflow cost is the number of packets dropped from the buffer, expressed as $Q_O(s,a)=(d_n-w_n+f_n-d_D)^+$, where $(z)^+=\max\{0,z\}$.

The expected system cost $g(s,a)$, on the other hand, is given by

$$g(s,a)=c(s,a)\,\tau(s,a), \tag{10}$$

where $\tau(s,a)$ denotes the service time and $c(s,a)$ is the power consumption cost incurred by choosing action $u_j$ in channel state $c_i$:

$$c(s,a)=P_T. \tag{11}$$

The power cost $c(s,a)=P_T$ can be found from (6) by replacing the instantaneous received SNR $\gamma$ with the average received SNR $\bar{\gamma}$ of channel state $c_j$, $\bar{\gamma}=\frac{1}{\pi_j^c}\int_{\gamma_{j-1}}^{\gamma_j}\gamma f_\Gamma(\gamma)\,d\gamma$, where $\pi_j^c$ is the probability that the channel is in state $c_j$.
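One way to obtain the power cost numerically is to invert the BER expression for the transmit power. The following Python sketch assumes that Eq. (6) is the standard approximation for square M-QAM, with $\sqrt{M}$ in the prefactor and summation limit and a square root inside the erfc argument. The bisection search is an illustrative choice and is not taken from the paper; `power_ratio` stands for $P_T/\bar{P}$, and all function names are hypothetical.

```python
import math

def mqam_ber(gamma, M, power_ratio=1.0):
    """BER of square M-QAM following Eq. (6); power_ratio stands for P_T / P_bar."""
    v = math.log2(M)
    root_m = math.isqrt(M)
    scale = math.sqrt(3.0 * v * gamma * power_ratio / (2.0 * (M - 1)))
    total = sum(math.erfc((2 * i - 1) * scale) for i in range(1, root_m // 2 + 1))
    return (2.0 / v) * (1.0 - 1.0 / root_m) * total

def required_power_ratio(gamma, M, target_ber, lo=1e-6, hi=1e6, iters=200):
    """Smallest P_T / P_bar in [lo, hi] that meets target_ber at SNR gamma,
    found by bisection on a log scale (the BER decreases as power increases)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if mqam_ber(gamma, M, mid) > target_ber:
            lo = mid
        else:
            hi = mid
    return hi

TARGET_BER = 1e-4   # average BER target used in Section 6
GAMMA_BAR = 1.0     # normalized average received SNR used in Section 6
POWER = {M: required_power_ratio(GAMMA_BAR, M, TARGET_BER) for M in (4, 16, 64)}
```

For $M=4$ the expression reduces to $\tfrac{1}{2}\operatorname{erfc}(\sqrt{\gamma})$, the familiar QPSK result, which gives a quick sanity check. Denser constellations need more power to reach the same BER at the same SNR.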

4.5 Sojourn Time

After an action is chosen, the expected sojourn time $\tau(s,a)$ is the expected time from the current event to the next one. Accordingly, the mean event rate $\gamma(s,a)$ is the sum of the rates of all constituent processes that can move the system from one state to another after action $a(s)$ is selected. They are computed as

$$\gamma(s,a)=\tau(s,a)^{-1}=\begin{cases}\lambda_c+\lambda_e, & \tilde{e}\in\{F,G\},\ a=-1,\\ \lambda_c+\lambda_e+R_{i,l}, & \tilde{e}\in\{C_l\},\ a=i,\ i\in\{0,1,\ldots,I\},\end{cases} \tag{12}$$

where $R_{i,l}$ is the modulation rate adopted when $i$ EUs are used and the channel is in state $l$. When new packets arrive at the transmitter's data buffer ($\tilde{e}\in\{F\}$) or new EUs are harvested ($\tilde{e}\in\{G\}$), no action is taken and no transmission service is in progress. Once the channel state changes ($\tilde{e}\in\{C_l\}$), the scheduler determines the system state and takes an action accordingly. The expected instant reward $r(s,a)$ over the period $\tau(s,a)$ is determined from the discounted reward model in [34]:

$$r(s,a)=Q_T(s,a)-c(s,a)\,E_s^a\left\{\int_0^{\tau}e^{-\alpha t}\,dt\right\}=Q_T(s,a)-c(s,a)\,E_s^a\left\{\frac{1-e^{-\alpha\tau}}{\alpha}\right\}=Q_T(s,a)-\frac{c(s,a)}{\alpha+\gamma(s,a)}, \tag{13}$$

where $Q_T(s,a)=Q_E(s,a)-Q_O(s,a)$ and $\alpha$ is a continuous-time discount factor. Using the transition probabilities in (7) and the reward model in (13), the maximal discounted long-term reward of state $s$ follows from the Bellman equation of the discounted reward model:

$$v(s)=\max_{a\in A}\left[r(s,a)+\lambda\sum_{s'\in S}p(s'\mid s,a)\,v(s')\right], \tag{14}$$

where $\lambda=\frac{\gamma(s,a)}{\alpha+\gamma(s,a)}<1$.
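The last equality in (13) relies on the expectation of the discount integral. A short derivation follows, under the assumption (not stated explicitly in the text) that the sojourn time $\tau$ is exponentially distributed with rate $\gamma(s,a)$:

```latex
\begin{aligned}
E_s^a\left\{\frac{1-e^{-\alpha\tau}}{\alpha}\right\}
 &= \frac{1}{\alpha}\left(1-\int_0^\infty e^{-\alpha t}\,\gamma(s,a)\,e^{-\gamma(s,a)t}\,dt\right) \\
 &= \frac{1}{\alpha}\left(1-\frac{\gamma(s,a)}{\alpha+\gamma(s,a)}\right)
  = \frac{1}{\alpha+\gamma(s,a)} .
\end{aligned}
```

Under the same assumption, the integral in the first line is $E\{e^{-\alpha\tau}\}=\gamma(s,a)/(\alpha+\gamma(s,a))$, which is the discount factor $\lambda$ appearing in (14).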

5 Adaptation Policy of the Cross-Layer Design

The cross-layer adaptation policy takes into account the energy capacitor and data buffer occupancies as well as the channel state in order to control the overflow cost. For example, on a time-varying channel the transmitter requires different transmit powers in different channel states. However, the transmitter can also send at a higher rate to avoid packet congestion when the data buffer is nearly full or when the average data arrival rate is high, and at a lower rate otherwise. In this section, we show how to optimally adjust the modulation rate for cross-layer EH networks using the SMDP approach. An optimal policy can be obtained with the iteration approach discussed in [35], as described in Algorithm 1.

[Algorithm 1]

Initially, both v(s) and Popt(s) are set to zero for each state s. Then v(s) and Popt(s) are updated repeatedly until the value v(s) of every state s equals its value from the previous iteration, which indicates that the procedure has converged. The resulting Popt(s) over all states is the system’s action-selection policy, which attains the maximal discounted reward.
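The iteration described above can be sketched as a value-iteration-style loop over Eq. (14). This is a minimal illustration rather than a reproduction of Algorithm 1: the function name `smdp_value_iteration`, its callback arguments, the tolerance, and the two-state toy model at the bottom (including α=0.1) are all hypothetical and chosen only to make the example self-contained.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

State = Hashable
Action = Hashable


def smdp_value_iteration(
    states: Iterable[State],
    feasible_actions: Callable[[State], Iterable[Action]],
    event_rate: Callable[[State, Action], float],                # gamma(s, a) as in Eq. (12)
    reward: Callable[[State, Action], float],                    # r(s, a) as in Eq. (13)
    transition: Callable[[State, Action], Dict[State, float]],   # p(s' | s, a)
    alpha: float,                                                # continuous-time discounting factor
    tol: float = 1e-9,                                           # assumed stopping tolerance
    max_iter: int = 10_000,
) -> Tuple[Dict[State, float], Dict[State, Action]]:
    """Iterate Eq. (14) until the values stop changing (within tol)."""
    states = list(states)
    v = {s: 0.0 for s in states}       # v(s) initialised to zero
    policy = {s: 0 for s in states}    # Popt(s) initialised to zero (placeholder)
    for _ in range(max_iter):
        v_new = {}
        for s in states:
            best_q = float("-inf")
            for a in feasible_actions(s):
                g = event_rate(s, a)
                lam = g / (alpha + g)  # discount factor lambda < 1
                q = reward(s, a) + lam * sum(
                    p * v[s2] for s2, p in transition(s, a).items()
                )
                if q > best_q:
                    best_q, policy[s] = q, a
            v_new[s] = best_q
        delta = max(abs(v_new[s] - v[s]) for s in states)
        v = v_new
        if delta < tol:                # converged: values match the previous iteration
            break
    return v, policy


# Hypothetical two-state toy model (not the system model of the paper):
# action 1 earns a reward in state 1 and incurs a cost in state 0.
v_opt, p_opt = smdp_value_iteration(
    states=[0, 1],
    feasible_actions=lambda s: (0, 1),
    event_rate=lambda s, a: 5.0 + a,
    reward=lambda s, a: float(a) if s == 1 else -float(a),
    transition=lambda s, a: {0: 0.5, 1: 0.5},
    alpha=0.1,
)
print(p_opt)
```

Because λ depends on the state-action pair through γ(s,a), the discount is computed inside the action loop rather than fixed once, which is the main difference from value iteration for a discrete-time MDP.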

6 Numerical Results

In this section, we show the performance of two adaptation strategies. The parameter values are set as follows. The energy harvesting and packet arrival processes are assumed to be Poisson with average rates λe=2 and λc=3, respectively. The finite energy capacitor size is Kmax = 20, the finite data buffer size is Dmax = 20, Ns/Np = 1, and the numbers of channel states and actions are C = 4 and U = 4, respectively. An independent and identically distributed Rayleigh fading channel with mean value m¯=1 is considered. The average transmission power is set to P¯=1 mW, and the corresponding normalized average received signal-to-noise ratio (SNR) is γ¯=1. The average channel bit error rate (BER) and the modulation constellation set are P¯e=10^-4 and w=[0,2,4,6] bits/symbol, respectively.
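This section does not say how the Rayleigh fading channel is partitioned into the C = 4 channel states. One common option, assumed here purely for illustration, is to choose SNR boundaries so that every state is equally likely. Under Rayleigh fading the instantaneous SNR is exponentially distributed with mean γ¯, so the boundaries are quantiles of that distribution. The function names below are hypothetical.

```python
import math
from typing import List


def equiprobable_snr_thresholds(mean_snr: float, num_states: int) -> List[float]:
    """SNR boundaries that split an exponentially distributed SNR
    (Rayleigh fading) into num_states equally likely intervals.

    Assumption: equal-probability partitioning; the paper may use another rule.
    """
    # k-th quantile of Exp(mean): x_k = mean * ln(C / (C - k)), k = 0..C-1
    bounds = [
        mean_snr * math.log(num_states / (num_states - k))
        for k in range(num_states)
    ]
    bounds.append(math.inf)  # upper edge of the best channel state
    return bounds


def state_probability(lo: float, hi: float, mean_snr: float) -> float:
    """P(lo <= SNR < hi) for an exponential SNR with the given mean."""
    return math.exp(-lo / mean_snr) - math.exp(-hi / mean_snr)


# Values from the parameter list above: mean SNR = 1, C = 4 channel states.
bounds = equiprobable_snr_thresholds(mean_snr=1.0, num_states=4)
probs = [state_probability(bounds[k], bounds[k + 1], 1.0) for k in range(4)]
print(bounds)
print(probs)
```

With γ¯=1 the middle boundary is ln 2 ≈ 0.693, the median of the exponential distribution, and each of the four states occurs with probability 0.25.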


Figure 2: Total throughput and overflow probability versus packet arrival rate for the different schemes

Fig. 2 compares the total throughput and blocking probability of the channel-dependent static adaptation policy at the physical layer with those of the cross-layer dynamic adaptation policy. The throughput of the proposed cross-layer scheme matches that of the benchmark scheme. However, the benchmark scheme does not track the state of the energy capacitor in each time period, because it assumes that the energy available in each interval is infinite. In practice, the average transmitted power is bounded by the energy harvesting rate λe, so the benchmark’s control action may not always be feasible. Therefore, although the benchmark scheme has low computational complexity, it is not applicable in practice. The dashed curves show the actual average system throughput under the channel-dependent static adaptation strategy. The throughput gap between the cross-layer dynamic strategy and the channel-dependent static strategy widens as the packet arrival rate grows.

Fig. 2 also shows that the blocking probability of the channel-dependent static strategy increases with the packet arrival rate, whereas that of the cross-layer dynamic strategy remains minimal and close to zero even as the data arrival rate increases. The reason is that in the channel-dependent policy the scheduler selects the modulation constellation based only on the channel state, without tracking the capacitor and buffer states. Consequently, that policy cannot guarantee the overflow requirement. In the cross-layer policy, by contrast, the BER and packet overflow requirements are met even at a high data arrival rate.

Fig. 3 shows the total throughput versus the maximum data buffer size for the cross-layer dynamic policy and the channel-dependent static policy. The total throughput increases with the finite buffer size for both policies. However, while the throughput grows quickly for small buffer sizes, the growth slows down as the data buffer size increases. The proposed cross-layer strategy again achieves the same overall throughput as the benchmark method. Moreover, the figure shows that the proposed scheme outperforms the static approach and that the performance difference between them increases with the maximum data buffer size. It can be concluded that, although the complexity of the cross-layer dynamic scheme is higher, it is still worth implementing because of its performance advantage over the static method, especially as the data buffer size increases.


Figure 3: Total throughput versus data buffer size for the different schemes
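The qualitative trends for a policy that does not adapt to the buffer state (blocking that rises with the packet arrival rate, and throughput that grows with buffer size at a diminishing rate) can be illustrated with a textbook M/M/1/K queue. This is a deliberately simplified stand-in, not the system model of the paper: it ignores the channel and the energy capacitor, and the fixed service rate of 3 packets per unit time is an assumed value.

```python
import math


def mm1k_blocking(arrival_rate: float, service_rate: float, capacity: int) -> float:
    """Blocking (overflow) probability of an M/M/1/K queue with K = capacity."""
    rho = arrival_rate / service_rate
    if math.isclose(rho, 1.0):
        return 1.0 / (capacity + 1)
    return (1.0 - rho) * rho**capacity / (1.0 - rho ** (capacity + 1))


def mm1k_throughput(arrival_rate: float, service_rate: float, capacity: int) -> float:
    """Accepted-packet rate: arrivals that are not dropped at a full buffer."""
    return arrival_rate * (1.0 - mm1k_blocking(arrival_rate, service_rate, capacity))


SERVICE_RATE = 3.0  # assumed fixed service rate (hypothetical value)

# Blocking versus packet arrival rate for a buffer of 20 packets.
blocking = [mm1k_blocking(lam, SERVICE_RATE, 20) for lam in (1.0, 2.0, 3.0, 4.0, 5.0)]

# Throughput versus buffer size at an arrival rate of 3 packets per unit time.
throughput = [mm1k_throughput(3.0, SERVICE_RATE, k) for k in (5, 10, 15, 20)]

print(blocking)
print(throughput)
```

In this simplified queue the blocking probability increases monotonically with the arrival rate, and each additional block of buffer space yields a smaller throughput gain than the previous one.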

7 Conclusions

Energy harvesting (EH) technology in wireless communications is a promising approach to extend the lifetime of future wireless networks. Unlike most adaptation strategies in the literature, which are based only on channel-dependent adaptation at the physical layer, this paper investigates a cross-layer optimal adaptation strategy for a point-to-point EH wireless communication system with finite buffer constraints over a Rayleigh fading channel, based on a Semi-Markov Decision Process (SMDP). Channel-based transmission scheduling provides the benchmark for the maximum performance of the physical layer under the assumptions that the data buffer always has data to transmit and that the data buffer and the energy buffer are infinite. A practical cross-layer adaptation design, however, must stabilize the system performance by maximizing the throughput while reducing the drop probabilities and minimizing the buffer delay. Therefore, the SMDP framework has been applied to determine the optimal policy for a single-hop EH network under both channel-dependent static adaptation and cross-layer dynamic adaptation. In cross-layer adaptation, throughput is maximized by tracking the states of the energy capacitor, the data buffer, and the channel in order to optimally control the transmit power and rate over the transmit time intervals. The numerical results show that the cross-layer adaptation policy outperforms the channel-dependent policy by guaranteeing the overflow rate, and hence the network throughput, in a network with green communication features and EH sources. Moreover, unlike the benchmark scheme, the proposed cross-layer scheme is implementable, and it still provides the same throughput as the benchmark scheme for all packet arrival rates and maximum buffer sizes considered.
As a suggestion for future work, an optimal transmission policy based on the SMDP formulation can be applied to cooperative wireless communication in which both the source and the relay have energy harvesting capability. Since the proposed model is based on a single-hop connection between the source and the destination, relays with EH capability can help relay the information signal when there is a direct connection between the sender and the receiver (cooperative communication), saving more energy and speeding up the data transmission. Both cooperative communication and relay selection protocols can be analyzed in terms of throughput, outage probability, and energy efficiency.

Acknowledgement: The authors sincerely acknowledge the support from Majmaah University, Saudi Arabia for this research.

Funding Statement: The authors would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2021-60.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.

References

1. S. Zeadally, S. U. Khan and N. Chilamkurti, “Energy-efficient networking: Past, present, and future,” Journal of Supercomputing, vol. 62, no. 3, pp. 1093–1118, 2012.

2. A. Srivastava, M. S. Gupta and G. Kaur, “Energy efficient transmission trends towards future green cognitive radio networks (5G): Progress, taxonomy and open challenges,” Journal of Network and Computer Applications, vol. 168, pp. 102760, 2020. https://doi.org/10.1016/J.JNCA.2020.102760.

3. W. Ye, J. Heidemann and D. Estrin, “An energy-efficient MAC protocol for wireless sensor networks,” in Proc. Twenty-First Annual Joint Conf. of the IEEE Computer and Communications Societies, New York, NY, USA, IEEE, vol. 3, pp. 1567–1576, 2002.

4. N. Sami, T. Mufti, S. S. Sohail, J. Siddiqui and D. Kumar, “Future internet of things (IoT) from cloud perspective: Aspects, applications and challenges,” in Internet of Things. Cham, Switzerland: Springer, pp. 515–532, 2020.

5. W. Vereecken, W. V. Heddeghem, D. Colle, M. Pickavet and P. Demeester, “Overall ICT footprint and green communication technologies,” in 4th Int. Sym. on Communications, Control and Signal Processing, Limassol, Cyprus, IEEE, pp. 1–6, 2010.

6. I. B. Sofi and A. Gupta, “A survey on energy efficient 5G green network with a planned multi-tier architecture,” Journal of Network and Computer Applications, vol. 118, pp. 1–28, 2018.

7. A. Kansal, J. Hsu, S. Zahedi and M. B. Srivastava, “Power management in energy harvesting sensor networks,” ACM Transactions on Embedded Computing Systems, vol. 6, no. 4, pp. 1–38, 2007.

8. V. Sharma, U. Mukherji, V. Joseph and S. Gupta, “Optimal energy management policies for energy harvesting sensor nodes,” IEEE Transactions on Wireless Communications, vol. 9, no. 4, pp. 1326–1336, 2010.

9. Q. Zhang, A. Agbossou, Z. Feng and M. Cosnier, “Solar micro-energy harvesting based on thermoelectric and latent heat effects. Part II: Experimental analysis,” Sensors and Actuators A: Physical, vol. 163, no. 1, pp. 284–290, 2010.

10. S. Sudevalayam and P. Kulkarni, “Energy harvesting sensor nodes: Survey and implications,” IEEE Communications Surveys & Tutorials, vol. 13, no. 3, pp. 443–461, 2010.

11. J. Xu and R. Zhang, “Throughput optimal policies for energy harvesting wireless transmitters with non-ideal circuit power,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 2, pp. 322–332, 2013.

12. M. A. Antepli, E. U. Biyikoglu and H. Erkal, “Optimal packet scheduling on an energy harvesting broadcast link,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 8, pp. 1721–1731, 2011.

13. O. Ozel, K. Tutuncuoglu, J. Yang, S. Ulukus and A. Yener, “Transmission with energy harvesting nodes in fading wireless channels: Optimal policies,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 8, pp. 1732–1743, 2011.

14. M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, P. Whiting et al., “Providing quality of service over a shared wireless link,” IEEE Communications Magazine, vol. 39, no. 2, pp. 150–154, 2001.

15. Yi. Changyan and J. Cai, “A truthful mechanism for scheduling delay-constrained wireless transmissions in IoT-based healthcare networks,” IEEE Transactions on Wireless Communications, vol. 18, no. 2, pp. 912–925, 2018.

16. X. Zhang, W. Cheng and H. Zhang, “Heterogeneous statistical QoS provisioning over airborne mobile wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 9, pp. 2139–2152, 2018.

17. B. Mohammed, L. Mushu, H. Liang and L. Zhao, “SMDP-based resource allocation for wireless networks with energy harvesting constraints,” in IEEE 86th Vehicular Technology Conf., Toronto, Canada, IEEE, pp. 1–6, 2017.

18. H. Peter, L. Zhao, S. Zhou and Z. Niu, “Recursive waterfilling for wireless links with energy harvesting transmitters,” IEEE Transactions on Vehicular Technology, vol. 63, no. 3, pp. 1232–1241, 2013.

19. X. Chen, Y. Liu, L. X. Cai, Z. Chen and D. Zhang, “Resource allocation for wireless cooperative IoT network with energy harvesting,” IEEE Transactions on Wireless Communications, vol. 19, no. 7, pp. 4879–4893, 2020.

20. A. Ortiz, T. Weber and A. Klein, “Multi-agent reinforcement learning for energy harvesting two-hop communications with a partially observable system state,” IEEE Transactions on Green Communications and Networking, vol. 5, no. 1, pp. 442–456, 2021.

21. H. Zhang, S. Huang, C. Jiang, K. Long, V. C. M. Leung et al., “Energy efficient user association and power allocation in millimeter-wave-based ultra dense networks with energy harvesting base stations,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 9, pp. 1936–1947, 2017.

22. X. C. Lin, Y. Liu, T. H. Luan, X. S. Shen, J. W. Mark et al., “Sustainability analysis and resource management for wireless mesh networks with renewable energy supplies,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 2, pp. 345–355, 2014.

23. L. Hongbin, L. X. Cai, D. Huang, X. Shen and D. Peng, “An SMDP-based service model for interdomain resource allocation in mobile cloud networks,” IEEE Transactions on Vehicular Technology, vol. 61, no. 5, pp. 2222–2232, 2012.

24. L. Yanchen, M. J. Lee and Y. Zheng, “Adaptive multi-resource allocation for cloudlet-based mobile cloud computing system,” IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2398–2410, 2015.

25. K. Zheng, H. Meng, P. Chatzimisios, L. Lei and X. Shen, “An SMDP-based resource allocation in vehicular cloud computing systems,” IEEE Transactions on Industrial Electronics, vol. 62, no. 12, pp. 7920–7928, 2015.

26. H. Hongli, H. Shan, A. Huang and L. Sun, “Resource allocation for video streaming in heterogeneous cognitive vehicular networks,” IEEE Transactions on Vehicular Technology, vol. 65, no. 10, pp. 7917–7930, 2016.

27. L. Lei, H. Xu, X. Xiong, K. Zheng and W. Xiang, “Joint computation offloading and multiuser scheduling using approximate dynamic programming in NB-IoT edge computing system,” IEEE Internet of Things Journal, vol. 6, no. 3, pp. 5345–5362, 2019.

28. L. Hou, K. Zheng, P. Chatzimisios and Y. Feng, “A continuous-time Markov decision process-based resource allocation scheme in vehicular cloud for mobile video services,” Computer Communications, vol. 118, pp. 140–147, 2018.

29. Q. Li, L. Zhao, J. Gao, H. Liang and X. Tang, “SMDP-based coordinated virtual machine allocations in cloud-fog computing systems,” IEEE Internet of Things Journal, vol. 5, no. 3, pp. 1977–1988, 2018.

30. X. Xiong, L. Hou, K. Zheng, W. Xiang, M. S. Hossain et al., “SMDP-based radio resource allocation scheme in software-defined Internet of Things networks,” IEEE Sensors Journal, vol. 16, no. 20, pp. 7304–7314, 2016.

31. G. Qi, G. Wang, R. Fan, N. Zhang, H. Jiang et al., “Optimal resource allocation in wireless powered relay networks with nonlinear energy harvesters,” IEEE Wireless Communications Letters, vol. 9, no. 3, pp. 371–375, 2019.

32. H. V. Toan and T. M. Hoang, “Outage probability analysis of decode-and-forward two-way relaying system with energy harvesting relay,” Wireless Communications and Mobile Computing, vol. 2020, pp. 1–13, 2020. https://doi.org/10.1155/2020/8886487.

33. L. Jianhua, K. B. Letaief, J. C.-I. Chuang and M. L. Liou, “M-PSK and M-QAM BER computation using signal-space concepts,” IEEE Transactions on Communications, vol. 47, no. 2, pp. 181–184, 1999.

34. S. D. Trapasiya and H. B. Soni, “Energy efficient policy selection in wireless sensor network using cross layer approach,” IET Wireless Sensor Systems, vol. 7, no. 6, pp. 191–197, 2017.

35. D. P. Bertsekas, “Approximate policy iteration: A survey and some new methods,” Journal of Control Theory and Applications, vol. 9, no. 3, pp. 310–335, 2011.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.