Resource Allocation and Optimization in Device-to-Device Communication 5G Networks

Abstract: Next-generation wireless networks are expected to provide higher capacity and system throughput with improved energy efficiency. One of the key technologies for meeting the demand for high-rate transmission is device-to-device (D2D) communication, which allows users who are close to each other to communicate directly instead of relaying through base stations, and lets D2D users share cellular users' channel resources under the control of the cellular network. D2D communication improves spectrum resource utilization and system throughput, and it has attracted wide attention in the industry. However, because cellular network resources are shared, D2D communication can cause severe interference to existing cellular systems. Spectrum resource utilization and energy consumption are therefore among the most important factors in D2D communication and need considerable research attention. To address these issues, this paper proposes an efficient algorithm based on particle swarm optimization. The main idea is to maximize energy efficiency through overall link optimization of D2D user pairs by generating an allocation matrix of spectrum and power. D2D users are allowed to reuse the resources of multiple cellular users, and their total energy efficiency is enhanced subject to quality of service constraints by modifying the position and velocity updates of the particle swarm. This modification also makes it feasible to solve the original fractional programming problem. Simulation results indicate that the proposed scheme improves energy efficiency and spectrum utilization compared with competing alternatives.


Introduction
Wireless networks are moving towards higher energy efficiency, better resource utilization, and greater capacity. To meet these demands, device-to-device (D2D) communication has been developed as one of the most influential technologies [1][2][3][4][5]. D2D communication allows mobile devices that are close to each other in a cellular network to transmit data directly, without the intervention of a base station (BS) [6][7][8][9][10][11][12]. Bypassing the BS can reduce the terminal transmission power, improve throughput, and increase system spectrum efficiency. When D2D users reuse cellular users' spectrum in the cell, spectrum resources are saved and spectrum utilization improves, but interference arises between D2D and cellular users [13][14][15][16][17][18]. D2D communication has been introduced into Long Term Evolution-Advanced (LTE-A) as a means of improving the spectrum efficiency of the cellular system. Because the communicating mobile terminals are physically close to each other [19][20][21][22][23][24], data is not relayed and forwarded by the base station but is sent over a locally established direct link. This reduces the load on the base station, the end-to-end delay, and the power consumption, and improves spectrum utilization [25,26].
To effectively control the interference between D2D users and cellular users, researchers have proposed various resource allocation algorithms for hybrid cellular and D2D networks in recent years [27][28][29][30]. However, existing resource allocation algorithms for hybrid cellular and D2D networks all restrict how D2D users may reuse resources. For example, [28] stipulates that a pair of D2D users can only reuse the spectrum resources of a single cellular user, and [29,30] allow the resources of a cellular user to be reused by at most one pair of D2D users. The works in [31][32][33] impose both restrictions. Although these restrictions simplify the allocation problem, the spectrum cannot be fully utilized. In addition, the existing algorithms study how D2D users reuse cellular user resources only after the cellular user resource allocation has been completed. In a real system, the communication scenario changes in real time: D2D users establish new connections while new cellular users access the network. Separating the resource allocation of cellular users and D2D users does not achieve the best overall performance and causes allocation delays, which reduce network throughput.
References [34][35][36] apply D2D communication to relay cooperative networks. In [34,35], under different D2D relay system models, closed-form outage probability expressions are derived for the amplify-and-forward relay D2D network over N-Nakagami fading channels, optimal and suboptimal schemes are obtained through transmit antenna selection and power allocation, and an exact closed-form outage probability expression is given for the optimal transmit antenna selection scheme. The authors in [36] take transmission security as the goal and use stochastic geometry modeling to propose three power transmission strategies, deriving expressions for power, secrecy outage probability, and secrecy throughput to evaluate the security performance of the system. The D2D communication in all of the above works operates in the non-multiplexed mode, whereas the multiplexed mode can improve the spectrum utilization of the system.
In the multiplexing mode, the D2D link and the cellular user (CU) link use the same spectrum resources, so the D2D link interferes with the CU link. Power control and resource allocation in radio resource management can effectively reduce this interference [37]. Much research focuses on improving system throughput through resource allocation or power control [38][39][40]. The authors in [38] combined the particle swarm algorithm and the genetic algorithm to propose the PSO-GA algorithm. Under the condition of avoiding interference, the resource allocation is mapped to the position of the particle, the system throughput is used as the fitness function of the particle swarm algorithm, and the method obtains the resource allocation that maximizes system throughput. In [39], the authors use game theory to propose a distributed resource allocation scheme that coordinates the interference from D2D user links to the CU link through pricing, while each D2D user pair (DP) competes with other D2D user pairs to reuse the available resources. Reference [40] proposed a game theory-based power allocation scheme. The power allocation problem is modeled as a stochastic game, and a Nash equilibrium is proved to exist: at the equilibrium point, a unilateral change in a participant's behavior leads to a decrease in its revenue. Most of the current literature focuses on improving spectrum efficiency, and few studies address the optimization of energy efficiency. In [41], the authors proposed a two-layer optimization that converts the original fractional non-convex optimization problem into a subtractive-form optimization problem and obtains the solution through an iterative method. However, that work only considers the QoS of D2D users and ignores the service quality of CUs.
The authors in [42] combine mode selection, power allocation, and channel assignment, taking minimization of the total power as the optimization goal. The above works all first decompose the fractional programming problem into two sub-problems, power control and resource allocation, and find the optimal solution of each sub-problem to obtain a suboptimal solution for the system. This step-by-step optimization restricts system performance to a certain extent. In this paper, to maximize the total energy efficiency of D2D under guaranteed QoS for CUs and D2D user pairs (i.e., their minimum rate requirements), a particle swarm-based joint power control and resource allocation algorithm is proposed. The position and velocity update method of the particle swarm is modified to make it suitable for solving the original fractional programming problem.
The remainder of the paper is organized as follows. Section 2 discusses the system model with analytical expressions. Section 4 describes the proposed algorithm. Section 5 gives the numerical simulation results and discussion, and Section 6 concludes the paper.

System Model
The D2D communication system in the multiplexing mode is shown in Fig. 1. We consider a single-cell scenario under the LTE-A system in which all users are randomly distributed in the cell. There is a set of $N$ cellular users, $C = \{CU_n \mid n = 1, 2, \ldots, N\}$, and a set of $M$ D2D user pairs, $D = \{DP_m \mid m = 1, 2, \ldots, M\}$. Each D2D user pair contains one transmitter and one receiver. Since the resource utilization rate of the downlink is higher than that of the uplink [43], D2D user pairs are assumed to multiplex uplink resources under the control of the base station. The number of sub-channels allocated to the uplink is assumed to equal the number of cellular users and the network is fully loaded, that is, all orthogonal sub-channels are occupied by CUs (each CU occupies one sub-channel). The base station (BS) knows the channel state information (CSI) and QoS requirements of all users. To control interference, the channel resources of one CU can be multiplexed by at most one DP, while one DP can multiplex the resources of multiple CUs.

Problem Description
According to Shannon's formula, the transmission rate $R_n$ of the $n$th CU in the system can be expressed as Eq. (1), where $P_n$ is the transmit power of $CU_n$; $H_n$ is the channel gain from $CU_n$ to the BS based on path loss and shadow fading [43]; and $\rho_{m,n}$ is the channel indicator variable. If the $m$th D2D user pair multiplexes the resources of $CU_n$ (that is, the BS allocates the same sub-channel to the CU and the D2D user pair for data transmission), then $\rho_{m,n} = 1$; otherwise $\rho_{m,n} = 0$. $P_{m,n}$ is the transmit power of the $m$th D2D user pair when multiplexing the resources of $CU_n$; $H^D_{m,n}$ is the channel gain from the transmitter of the $m$th D2D user pair to the receiver of $CU_n$; and $n_0$ is the power of the white Gaussian noise in the channel.
The transmission rate $R_{m,n}$ of the $m$th D2D user pair multiplexing the resources of $CU_n$ can be expressed as Eq. (2), where $H_m$ is the channel gain from the transmitter to the receiver of the $m$th D2D user pair, and $H^C_{m,n}$ is the channel gain from $CU_n$ to the receiver of the $m$th D2D user pair.
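The two rate expressions can be illustrated with a minimal sketch built only from the symbol definitions above. The exact forms of Eqs. (1) and (2) are not reproduced in this text, so the formulas below are an assumption: a bandwidth-normalized Shannon rate (bit/s/Hz) in which each link sees the noise power plus the co-channel interference. The function and argument names are hypothetical.

```python
import numpy as np

def cu_rates(P_c, H_c, rho, P_d, H_d2cu, n0):
    """Assumed form of the CU rate: log2(1 + P_n*H_n / (n0 + sum_m rho*P_mn*H^D_mn)).

    P_c, H_c : shape (N,)   CU transmit powers and CU-to-BS gains
    rho, P_d : shape (M, N) channel indicators and D2D transmit powers
    H_d2cu   : shape (M, N) gains from D2D transmitter m to the receiver of CU n
    """
    interference = np.sum(rho * P_d * H_d2cu, axis=0)   # per sub-channel
    return np.log2(1.0 + P_c * H_c / (n0 + interference))

def d2d_rates(P_c, rho, P_d, H_dd, H_cu2d, n0):
    """Assumed form of the D2D rate: log2(1 + P_mn*H_m / (n0 + P_n*H^C_mn)).

    H_dd   : shape (M,)   gain between transmitter and receiver of pair m
    H_cu2d : shape (M, N) gains from CU n to the receiver of pair m
    Entries are zero where pair m does not reuse the sub-channel of CU n.
    """
    sinr = P_d * H_dd[:, None] / (n0 + P_c[None, :] * H_cu2d)
    return rho * np.log2(1.0 + sinr)
```

Because each sub-channel is reused by at most one D2D pair, the sum over $m$ in the CU interference term contains at most one non-zero entry per column.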
The goal of this paper is to maximize the total energy efficiency of D2D user pairs while guaranteeing users' quality of service. The circuit power consumption of the transmitting user and the receiving user is taken as $2P_0$, and $P_{cir}$ denotes the total circuit power consumption of a D2D user pair.
The total energy efficiency $EE$ is expressed as Eq. (3). The energy efficiency optimization problem is described as Eq. (4), $\max_{\rho_{m,n},\, P_{m,n},\, P_n} EE$, subject to constraints (4a)-(4f), where $R^{\min}_n$ represents the minimum rate requirement of $CU_n$, and $P^D_m$ and $P^C_n$ represent the maximum transmission power of the transmitter of the $m$th D2D user pair and of $CU_n$, respectively. Constraints (4a) and (4b) indicate that the resources of one CU can be multiplexed by only one D2D user pair, while one D2D user pair can reuse the resources of multiple CUs. Constraints (4c) and (4d) represent the QoS requirements of CUs and D2D user pairs respectively, namely that the CU and D2D user rates meet the given minimum values. Constraints (4e) and (4f) state that the transmission power of the CU and of the D2D user pair cannot exceed the specified maximum transmission power.
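As a rough illustration of the objective and constraints described above, the sketch below evaluates a candidate allocation. Eq. (3) and constraints (4a)-(4f) are not reproduced in this text, so several choices here are assumptions: the total energy efficiency is taken as the sum over D2D pairs of rate divided by transmit power plus circuit power $2P_0$, and the D2D rate and power limits are applied to the totals over all reused sub-channels. All names are hypothetical.

```python
import numpy as np

def total_energy_efficiency(rho, P_d, R_d, P0):
    """Assumed EE: sum over pairs m of (sum_n rho*R_mn) / (sum_n rho*P_mn + 2*P0)."""
    rate = np.sum(rho * R_d, axis=1)
    power = np.sum(rho * P_d, axis=1) + 2.0 * P0
    return float(np.sum(rate / power))

def is_feasible(rho, P_d, P_c, R_c, R_d, R_c_min, R_d_min, P_d_max, P_c_max):
    """Check a candidate (rho, P_d, P_c) against the constraints described in prose."""
    checks = [
        np.all(rho.sum(axis=0) <= 1),                 # one pair at most per CU resource
        np.all((rho == 0) | (rho == 1)),              # binary channel indicator
        np.all(R_c >= R_c_min),                       # CU minimum rate
        np.all((rho * R_d).sum(axis=1) >= R_d_min),   # D2D minimum rate (assumed total)
        np.all((rho * P_d).sum(axis=1) <= P_d_max),   # D2D power limit (assumed total)
        np.all((P_c >= 0) & (P_c <= P_c_max)),        # CU power limit
    ]
    return all(bool(c) for c in checks)
```

The ratio form of the objective is what makes the problem a fractional program: scaling up the power of a pair raises its rate only logarithmically while the denominator grows linearly.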

Problem Transformation
To solve Eq. (4), we first adjust the range of the transmission power $P_{m,n}$ of the $m$th D2D user pair multiplexing the resources of $CU_n$. Reference [44] gives a detailed derivation, which is not repeated here. According to Theorem 1, the rate of the $m$th D2D user pair multiplexing the resources of $CU_n$ in Eq. (2) can be rewritten as Eq. (4).

Proposed PSO Scheme
The particle swarm algorithm [45,46] finds the optimal solution mainly through collaboration and information sharing between individuals in the group. However, the two matrix variables of the optimization objective in this paper are the continuous power allocation variable $P$ and the discrete channel allocation variable $\rho$, so the continuous PSO algorithm cannot be applied directly to the fractional programming problem considered here. To handle this mixed discrete and continuous multi-dimensional optimization problem, the particle velocity and position update methods are modified. Assume that the population size is $I$ and that the position of a particle is composed of $P$ and $\rho$. The position and velocity of the $i$-th particle are $P_i$, $\rho_i$ and $\Delta P_i$, $\Delta \rho_i$ respectively, the historical best position of the $i$-th particle is $Pb_i$, $\rho b_i$, and the best position found by the population is $Pg$, $\rho g$.
The velocity update strategy of particle $i$ in iteration $\mu + 1$ uses the learning factors $c_1, c_2, c_3, c_4$, the random numbers $r_1, r_2, r_3, r_4$ on [0, 1], and the inertia weights $\omega_1, \omega_2$. Since the channel allocation matrix $\rho_i$ is a discrete variable, the position update strategy of the standard continuous PSO needs to be adjusted. Reference [46] proposed a binary PSO (BPSO) algorithm for discrete variables, which introduces the sigmoid function to convert continuous velocity values into discrete values. The sigmoid function is $S(x) = 1/(1 + e^{-x})$. In the update of the resource allocation component of the position, rand denotes a random number uniformly distributed on [0, 1]. The modified position update strategy can be applied to resource allocation problems with discrete variables.
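The update rules described here can be sketched as follows. The velocity and position update equations are not reproduced in this text, so the code uses the textbook PSO velocity update and the usual binary PSO rule (set a bit to 1 when a uniform random number falls below the sigmoid of its velocity) as stand-ins; the paper's own equations may differ in detail. Function names and the use of scalar random draws are assumptions.

```python
import numpy as np

def sigmoid(x):
    """Logistic function mapping a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def update_velocity(v, x, x_best, g_best, w, c_a, c_b, rng):
    """Standard PSO velocity update, used for both the P part and the rho part.

    For the power part one would pass (w1, c1, c2); for the channel part (w2, c3, c4).
    """
    r_a, r_b = rng.random(), rng.random()   # random numbers on [0, 1)
    return w * v + c_a * r_a * (x_best - x) + c_b * r_b * (g_best - x)

def update_binary_position(v_rho, rng):
    """Binary PSO position update: rho = 1 where rand < S(velocity), else 0."""
    return (rng.random(v_rho.shape) < sigmoid(v_rho)).astype(int)
```

A usage sketch: with `rng = np.random.default_rng(0)`, calling `update_binary_position(d_rho, rng)` on the channel velocity matrix yields a fresh 0/1 allocation matrix, which still has to be corrected so that each CU resource is reused by at most one D2D pair.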
The power component of the position of particle $i$ is updated next. Taking the resource allocation constraints into account, conditions (4a) and (4b) must be met after each particle's position is updated. The following rules ensure that the resources of a CU are used by at most one D2D link when several D2D links reuse the same CU resource. If only one of these D2D links obtains a transmission rate below $R^{\min}_m$ on its other resources, that is, Eq. (12) is satisfied, the resource is allocated to that link. If multiple D2D links satisfy Eq. (12), the CU resource is allocated within this group to the link that generates the maximum single-link energy efficiency. Here $(P_i)_{m,n}$ is the element in the $m$-th row and $n$-th column of $P_i$, that is, the power allocated in the $i$-th particle to the $m$-th D2D user for reusing the resources of the $n$-th CU; $(\rho_i)_{m,k}$ is the element in the $m$-th row and $k$-th column of $\rho_i$, which indicates whether the $m$-th D2D user reuses the resources of the $k$-th CU in the $i$-th particle.
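A simplified sketch of the conflict-resolution step is given below. It covers only the tie-breaking rule based on single-link energy efficiency and deliberately omits the minimum-rate test of Eq. (12), whose exact form is not available in this text. The function name and the `link_ee` matrix (energy efficiency of pair $m$ on the sub-channel of CU $n$) are hypothetical.

```python
import numpy as np

def resolve_conflicts(rho, link_ee):
    """Ensure each CU resource (column) is reused by at most one D2D pair.

    When several pairs claim the same column, keep only the pair with the
    largest single-link energy efficiency on that sub-channel.
    """
    rho = rho.copy()
    for n in range(rho.shape[1]):
        claimants = np.flatnonzero(rho[:, n])
        if claimants.size > 1:
            winner = claimants[np.argmax(link_ee[claimants, n])]
            rho[:, n] = 0
            rho[winner, n] = 1
    return rho
```

Rows are left unconstrained on purpose, since one D2D pair may reuse the resources of several CUs.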
According to constraints (6a) and (6b), the particles are adjusted following [46]. The value of the fitness function is the criterion for judging the position of a particle. To ensure that Eq. (6c) is satisfied during the optimization process, a penalty function is added to the optimization objective in Eq. (6) to form the fitness function of the particle swarm algorithm, where the penalty factor $\mu > 0$.
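The penalty idea can be illustrated with a short sketch. The actual fitness function and the exact form of Eq. (6c) are not reproduced in this text, so the code assumes a minimum-rate constraint as a stand-in and a linear penalty on the rate shortfall; both are assumptions, as are the names used.

```python
import numpy as np

def penalized_fitness(ee, rates, min_rates, mu):
    """Fitness = energy efficiency minus mu times the total constraint violation.

    `rates` and `min_rates` are the achieved and required rates of the links
    covered by the constraint; a feasible particle is not penalized.
    """
    shortfall = np.maximum(0.0, np.asarray(min_rates) - np.asarray(rates))
    return float(ee - mu * np.sum(shortfall))
```

With a sufficiently large $\mu$, any particle that violates the constraint scores below feasible particles with comparable energy efficiency, so the swarm is steered back into the feasible region.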
To ensure that the particles do not leave the search space during the evolution process, the velocity of the particles is limited in each iteration. Since the power cannot be negative, the minimum acceptable value of $\Delta P_i$ is $-P_i$, so we set $\Delta P_i \in [-P_i, 0.2 \times P^{\max}_{m,n}]$. Based on experience, the range of $\Delta \rho_i$ is set to $[-\Delta \rho_{\max}, \Delta \rho_{\max}]$. If the velocity of a particle exceeds this range, it is set to the nearest boundary value.
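The velocity limits above amount to element-wise clipping, as in the minimal sketch below. The function and argument names are hypothetical; `P_max` stands for $P^{\max}_{m,n}$ and may be a scalar or an array of the same shape as `P`.

```python
import numpy as np

def clamp_velocities(dP, P, P_max, d_rho, d_rho_max):
    """Limit dP to [-P, 0.2 * P_max] and d_rho to [-d_rho_max, d_rho_max]."""
    dP_clamped = np.clip(dP, -P, 0.2 * P_max)
    d_rho_clamped = np.clip(d_rho, -d_rho_max, d_rho_max)
    return dP_clamped, d_rho_clamped
```

The lower bound $-P_i$ guarantees that adding the clamped velocity to the current power never produces a negative value.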
The size of $\omega$ plays a decisive role in the global and local search capabilities of the algorithm. To prevent the algorithm from falling into a local optimum, the weight function usually takes the form of Eq. (16), where $\omega_{\max}$ is the initial weight, $\omega_{\min}$ is the final weight, $T$ is the maximum number of iterations, and $t$ is the current iteration number. According to [47], the algorithm performs best when the inertia weight $\omega$ lies in (0.3, 1.2). The dynamic weight function introduced in this paper is given in Eq. (17). The flow of the proposed algorithm is shown in Fig. 2. The best position of the output particles represents the resource allocation and power control solution that maximizes the total energy efficiency of D2D users.
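For the commonly used weight function, a minimal sketch is the linearly decreasing inertia weight below, which matches the roles given to $\omega_{\max}$, $\omega_{\min}$, $T$, and $t$. Whether Eq. (16) has exactly this form is an assumption, the default values 0.9 and 0.4 are illustrative choices inside the range (0.3, 1.2) rather than values from the paper, and the paper's own dynamic weight in Eq. (17) is not reproduced here.

```python
def linear_inertia_weight(t, T, w_max=0.9, w_min=0.4):
    """Inertia weight decreasing linearly from w_max (t = 0) to w_min (t = T)."""
    return w_max - (w_max - w_min) * t / T
```

A large weight early on favors global exploration, while the smaller weight in later iterations favors local refinement around the best positions found so far.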

Simulation Results and Analysis
In the simulation system, all users are randomly distributed in a regular hexagonal cell with a radius of 500 m, and the BS is located at the center; the other parameters are listed in Tab. 1. The shadow fading follows a normal distribution with a standard deviation of 8 dB. The simulation was run 1,000 times and the results were averaged.
To analyze the performance advantages of the proposed algorithm, it is compared with the power control (PC) scheme for energy efficiency optimization in [12] and the joint resource allocation and power control (JRAPC) scheme in [41]. In these schemes, a D2D user can reuse at most one CU resource and a CU resource can be reused by at most one D2D user; some parameters from the literature are modified appropriately to meet the system requirements. Note that the complexity of the algorithm in [41] is $O(MN^2)$, whereas the computational complexity of the proposed algorithm is $O(4MN)$, which is lower.

Fig. 2. Flow of the proposed algorithm: (1) initialize the particle swarm parameters; (2) randomly generate the position $P_i$, $\rho_i$ and velocity $\Delta P_i$, $\Delta \rho_i$ of each particle in the solution space; (3) use Eqs. (12)-(14) to correct $P_i$, $\rho_i$ of each particle; (4) calculate the fitness value according to Eq. (15); (5) update the individual best position $Pb_i$, $\rho b_i$ and the global best position $Pg$, $\rho g$; (6) if the maximum number of iterations has been reached, determine the best position $Pg$, $\rho g$ and end; otherwise update the particle velocity $\Delta P_i$, $\Delta \rho_i$ and continue with the next iteration.
Fig. 3 shows the performance comparison between the proposed algorithm and the algorithms in references [9,44]. The proposed algorithm effectively improves the energy efficiency of D2D communication compared with references [41,44]. When the number of D2D users is small, the number of shared sub-channels is large, and system resources can be fully used. The energy efficiency decreases as the minimum rate of the CU increases. To guarantee the rate requirement of the CU, the transmission power of D2D is reduced, which lowers the D2D rate; because the rate falls faster than the power, energy efficiency decreases.
In Fig. 4, the minimum rate requirements of DP and CU are both 2 bit/s/Hz. The figure shows how the distance between D2D users affects the energy efficiency of the proposed algorithm. The energy efficiency decreases as the distance between D2D users increases, mainly because the path loss increases with this distance. The distance between D2D users therefore has a great influence on the performance of the system.
Fig. 5 shows that the energy efficiency decreases as the minimum rate of DP increases. As the minimum rate increases, fewer DPs meet the quality of service constraints. At the same time, achieving a higher transmission rate requires a higher transmission power, which reduces the overall energy efficiency.
As shown in Fig. 6, because the PSO under the constraint conditions can make full use of the channel resources in the system, the D2D users occupy far more channel resources than in references [41,44], and the energy efficiency of the system is high. Because channel conditions differ, the number of resources occupied by different D2D users also differs. This shows that the proposed algorithm effectively improves resource reuse.

Conclusions
The energy consumption of terminals is increasing while terminal battery technology develops slowly, so algorithms that improve D2D energy efficiency are particularly important. To improve energy efficiency and resource utilization, this paper modifies the particle swarm algorithm and proposes a joint power control and resource allocation algorithm that maximizes energy efficiency while guaranteeing user QoS. Simulations verify that, compared with the scenario in which a D2D user can reuse at most one CU resource, the proposed algorithm significantly improves system energy efficiency and resource utilization, providing a swarm intelligence solution for system energy efficiency optimization. Reducing the complexity of intelligent optimization algorithms and extending the approach to multi-cell resource allocation scenarios require further research.