Construction and Optimization of TRNG Based Substitution Boxes for Block Encryption Algorithms

The Internet of Things (IoT) is an ecosystem of interconnected devices that are accessible through the internet. Recent research focuses on adding more smartness and intelligence to these edge devices, which makes them susceptible to various kinds of security threats. These edge devices rely on cryptographic techniques to encrypt the pre-processed data collected from the sensors deployed in the field. In this regard, the block cipher has been one of the most reliable options through which data security is accomplished. The strength of a block encryption algorithm against different attacks depends on its nonlinear primitive, the substitution box (S-box). S-boxes are mainly designed with algebraic and chaos-based techniques, but researchers have found various weaknesses in these techniques. On the other hand, the literature endorses true random numbers for information security because true random numbers are purely non-deterministic. In this paper, firstly, a natural dynamical phenomenon is utilized for the generation of true random number based S-boxes. Secondly, a systematic literature review (SLR) was conducted to determine which metaheuristic optimization technique has been most widely adopted in the current decade for the optimization of S-boxes. Based on the outcome of the SLR, the genetic algorithm is chosen for the optimization of the S-boxes. The results of our method validate that the proposed dynamic S-boxes are effective for block ciphers. Moreover, our results showed that the proposed substitution boxes achieve better

On the other hand, randomness is a fundamental aspect of many processes in nature and an indubitably valuable resource for cryptography. Researchers endorse true random numbers in cryptography because they are irreversible, unpredictable, and unreproducible even if their internal structure and response history are known to the adversaries. We use the lightning strike phenomenon as an entropy source to generate true random numbers. In this research, we first take the locations of lightning strikes in raw data format from the standard repository of the National Aeronautics and Space Administration (NASA), the Lightning Detection and Ranging (LDAR) system [38]. Secondly, a novel technique is proposed for the generation of TRNG based substitution boxes. Thirdly, a genetic algorithm is applied for the optimization of the newly generated S-boxes. Before this phase, a systematic literature review was conducted to determine which metaheuristic optimization technique has been most widely adopted in the last 10 years; based on the outcome of the SLR, the genetic algorithm is chosen. The remainder of the paper is structured as follows: Sections 2 and 3 present our contribution and the proposed methodology respectively, Section 4 presents the results and evaluation, and Section 5 concludes the paper.

Contribution
The main contributions of this research are as follows:
A novel method is proposed for the generation of substitution boxes of symmetric encryption algorithms based on true random numbers.
A multi-population genetic algorithm is applied for the optimization of substitution boxes.
A systematic literature review was conducted to determine which metaheuristic optimization technique has been most widely adopted in the last 10 years for the optimization of cryptographic substitution boxes.
A novel technique is proposed for true random number generation.

Proposed Design Methodology
The proposed technique has three phases: the first phase is true random bit extraction, the second phase is the construction of substitution boxes, and the third phase is the optimization of substitution boxes. These three phases are explained below, and the proposed system architecture is depicted in Fig. 1.

True Random Bits Extraction
To calculate the difference between the location of each lightning strike and the other lightning strike locations, we first acquire the lightning strike locations (in raw data format) from the standard repository of NASA [38]. The Lightning Detection and Ranging system is a volumetric lightning mapping system that stores the real-time location of each striking point. The format of the raw data is dd,hh,mm,ss,ll,xx,yy,zz, where dd represents the day of the month, hh the hour, mm the minute, ss the second, and ll the microsecond; xx and yy represent the distance in meters from site-1 in the east and north directions respectively, and zz represents the distance in meters above the surface of the earth. Secondly, the location of the first lightning strike is subtracted from the locations of the other lightning strikes. For the experiment, the data of 9,613,227 randomly picked lightning strikes are processed by the proposed algorithm, and the resultant binary stream is then enhanced through the Von Neumann extractor. After this, the random bits are tested with the National Institute of Standards and Technology (NIST) statistical test suite. The results of the NIST statistical tests (shown in Appendix A) indicate that lightning strikes are a useful entropy source for the generation of true random numbers.

Construction of Substitution Boxes
In the first step of dynamic substitution box construction, an algorithm named SboxConstruction() is proposed. This algorithm takes the true random bits from the previous phase and the total number of required S-boxes from the user as parameters. In the second step, the SboxConstruction() algorithm calls the RandomWalk() algorithm, passing it the two-dimensional N × N space of truly random bits. On every run, the RandomWalk() algorithm generates a single dynamic S-box by using the 8-state random walk rule (left = 0, left up = 1, upward = 2, right up = 3, right = 4, right down = 5, down = 6, left down = 7). The SboxConstruction() algorithm is presented above. Two randomly chosen RandomWalk() runs are plotted as samples in Fig. 2a and 2b, and their resultant S-boxes are given in Tab. 1a and 1b. For the results, a total of ten thousand S-boxes are constructed: the nonlinearity score of 1674 S-boxes is ≤ 99, that of 3277 S-boxes is between 100 and 102, that of 4126 S-boxes is between 103 and 104, and that of 923 S-boxes is between 105 and 106. The nonlinearity scores of our sample S-boxes of Fig. 2a and 2b are 105 and 104 respectively. A sample TRNG based S-box is shown in Appendix B.
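The Von Neumann extraction step mentioned in the bit extraction phase can be illustrated with a minimal sketch. It assumes the raw bits are available as a list of 0/1 integers; the function name `von_neumann_extract` and the sample input are hypothetical and not taken from the paper, and the way lightning location differences are turned into raw bits is not reproduced here.

```python
def von_neumann_extract(bits):
    """Debias a 0/1 sequence with the classic Von Neumann extractor."""
    output = []
    # Consume the input in non-overlapping pairs; a trailing odd bit is ignored.
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:
            # Pair (0, 1) emits 0 and pair (1, 0) emits 1; equal pairs are dropped.
            output.append(first)
    return output


raw_bits = [0, 1, 1, 0, 0, 0, 1, 1, 1, 0]  # hypothetical raw stream
print(von_neumann_extract(raw_bits))  # [0, 1, 1]
```

The extractor discards a large share of the input (all equal pairs), which is one reason a large number of strikes is needed to obtain enough output bits.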

Optimization of Substitution Box
The most important strength of the true random based construction of S-boxes is that these S-boxes are immune to the attacks mentioned in the introduction. This first goal is achieved in Section 3.2 above by using true randomness. The second goal of this research is to find out which metaheuristic optimization technique is effective (most widely adopted) in the existing literature for the optimization of S-boxes (which are not based on true randomness). To answer this question, an SLR was conducted over the last 10 years. The query-wise search results of the SLR are given in Tab. 2 and, based on the SLR recommendation, a technique for the optimization of the newly constructed S-boxes using the genetic algorithm (GA) is presented below. The reverse S-box algorithm for the TRNG and GA based S-boxes is shown in Appendix C. GA is based on Charles Darwin's theory of natural selection and survival of the fittest. In the standard genetic algorithm, selection, crossover, and mutation are the common processes. In our problem, for the selection process, all the substitution boxes whose nonlinearity score lies in the range of 100 to 106 are taken as the initial population. For the crossover process, we used the one-point crossover strategy, in which a pair of parents exchange one half with each other to generate new offspring. The complete crossover process is represented in yellow in the flowchart depicted in Fig. 1.
The nonlinearity score is widely regarded in the literature [39][40][41][42][43][44] as a highly desirable property, which is why we also chose it as the fitness value. In the conventional genetic algorithm, crossover and mutation are two independent processes, but here, for substitution box generation, we combine the two. The reason is that the substitution boxes generated by the crossover operation usually do not satisfy the bijective property because of repeated elements. Repetition must therefore be removed from the new offspring to achieve the bijective property, and for that purpose a simple strategy is adopted in the mutation process. In the mutation process, we flip one bit at a time, from the right side to the left side, and after flipping each bit, the resulting number is checked against the S-box. If the resulting number is unique, we stop the flipping process; otherwise we flip the next bit, and so on.
In most cases we achieved the bijective property in this way, but in a few cases, after flipping all eight bits, we still did not get a unique element of the S-box; in those cases, we simply add one to the original element from which we started the flipping process. The mutation process is represented in green in the flowchart of Fig. 1. From the results, we observed that numerous S-boxes and their subsets are repeated; therefore we used SHA-3 based searching for their removal, and this process is shown in blue in the flowchart. For the experiment, one hundred thousand S-boxes are optimized and, to examine the performance, we compared the ten thousand highest-quality optimized S-boxes with the ten thousand S-boxes already constructed through the TRNG based technique. After optimization, it can be clearly seen from Fig. 3 that GA improves the overall quality of the S-boxes. After the optimization process, no S-box has a nonlinearity less than or equal to 99, whereas the TRNG based S-box construction technique produced 1674 S-boxes in that range. Similarly, GA produces more S-boxes in the range of 105 to 106 than the TRNG construction method. Furthermore, GA produces 616 new S-boxes with a nonlinearity of 107 or 108, which were not discovered by the TRNG based approach.
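The combined crossover and mutation step, together with the SHA-3 based removal of repeated S-boxes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, bit flips are assumed to accumulate from the least significant bit leftwards, uniqueness is checked against the entries already placed, and the "add one" fallback is assumed to repeat (modulo 256) until an unused value is found. Only whole S-boxes are de-duplicated here; repeated subsets are not handled.

```python
import hashlib
import random


def one_point_crossover(parent_a, parent_b):
    """Exchange the second halves of two parent S-boxes (cut in the middle)."""
    half = len(parent_a) // 2
    return parent_a[:half] + parent_b[half:], parent_b[:half] + parent_a[half:]


def repair_to_bijection(child):
    """Replace repeated entries so that the offspring becomes a permutation."""
    size = len(child)
    bits = size.bit_length() - 1
    seen = set()
    repaired = []
    for value in child:
        if value in seen:
            candidate = value
            # Assumption: flips accumulate, starting at the rightmost bit.
            for bit in range(bits):
                candidate ^= 1 << bit
                if candidate not in seen:
                    break
            else:
                # Fallback described in the text: add one to the original element.
                # Assumption: keep adding one (mod size) until the value is unused.
                candidate = (value + 1) % size
                while candidate in seen:
                    candidate = (candidate + 1) % size
            value = candidate
        seen.add(value)
        repaired.append(value)
    return repaired


def remove_repeated(sboxes):
    """Drop S-boxes whose SHA3-256 digest has already been seen."""
    digests = set()
    unique = []
    for sbox in sboxes:
        digest = hashlib.sha3_256(bytes(sbox)).digest()
        if digest not in digests:
            digests.add(digest)
            unique.append(sbox)
    return unique


rng = random.Random(7)  # seeded only to make the sketch repeatable
parent_a = rng.sample(range(256), 256)
parent_b = rng.sample(range(256), 256)
children = [repair_to_bijection(c) for c in one_point_crossover(parent_a, parent_b)]
population = remove_repeated(children + [children[0]])
```

In a full GA loop the nonlinearity score would then be computed for each surviving offspring and used as the fitness value for the next selection round.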

Nonlinearity
Out of all cryptographic properties, nonlinearity is said to be the most significant. For a strong encryption scheme, the mapping between input and output in an S-box must be nonlinear. Nonlinearity can be defined as the smallest distance of a Boolean function to the set of affine functions. In other words, the nonlinearity score is the number of bits that must be altered in the truth table of the Boolean function to reach the closest affine function. In Fig. 3 above we saw that six hundred sixteen S-boxes attain a nonlinearity score of 107 or 108; from these S-boxes, we randomly picked one as a sample, which is shown in Tab. 1c. The nonlinearity scores of the proposed S-boxes are better than or equal to those of the state-of-the-art S-boxes shown in Tab. 3. By using the Walsh spectrum, the nonlinearity of a Boolean function g is determined as:

$$N_g = 2^{n-1}\left(1 - 2^{-n}\max_{\varphi \in GF(2^n)}\left|S_{(g)}(\varphi)\right|\right)$$

where $S_{(g)}(\varphi)$ is defined as:

$$S_{(g)}(\varphi) = \sum_{x \in GF(2^n)} (-1)^{g(x) \oplus x \cdot \varphi}$$

where $\varphi$ is an n-bit vector and $\varphi \in GF(2^n)$. The dot product between x and $\varphi$ is denoted by $x \cdot \varphi$:

$$x \cdot \varphi = x_1\varphi_1 \oplus x_2\varphi_2 \oplus \cdots \oplus x_n\varphi_n$$
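The Walsh spectrum based nonlinearity can be computed with a fast Walsh-Hadamard transform. The sketch below is illustrative only: the function names are hypothetical, and it reports the nonlinearity of each of the n output-bit (coordinate) Boolean functions of an S-box. How the paper aggregates these values into a single score is not stated in the text, so that choice is left to the reader.

```python
def walsh_spectrum(truth_table):
    """Walsh spectrum of a Boolean function g given as a 0/1 truth table."""
    spectrum = [1 - 2 * bit for bit in truth_table]  # (-1)^g(x)
    step = 1
    while step < len(spectrum):
        for start in range(0, len(spectrum), 2 * step):
            for i in range(start, start + step):
                a, b = spectrum[i], spectrum[i + step]
                spectrum[i], spectrum[i + step] = a + b, a - b
        step *= 2
    return spectrum


def boolean_nonlinearity(truth_table):
    """N_g = 2^(n-1) * (1 - 2^(-n) * max|S_g|) = (2^n - max|S_g|) / 2."""
    max_abs = max(abs(value) for value in walsh_spectrum(truth_table))
    return (len(truth_table) - max_abs) // 2


def coordinate_nonlinearities(sbox):
    """Nonlinearity of every output-bit function of an n x n S-box."""
    n = len(sbox).bit_length() - 1
    return [
        boolean_nonlinearity([(y >> bit) & 1 for y in sbox])
        for bit in range(n)
    ]


# The identity mapping is linear, so every coordinate has nonlinearity 0.
print(coordinate_nonlinearities(list(range(256))))
```

For an 8-bit S-box this needs eight transforms of length 256, which is cheap enough to serve as a GA fitness function.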

Strict Avalanche Criteria (SAC)
SAC measures the number of output bits that change when a single input bit is inverted. To make the system more reliable, each output bit should change with probability one half when one input bit is inverted. To evaluate the SAC property, a dependence matrix is used. For an S-box to satisfy the SAC property, all values in its dependence matrix need to be close to the ideal value of 0.5. The SAC value of our randomly picked optimized S-box is 0.501465, which satisfies the avalanche criterion. The offsets of the dependence matrix can be determined by:

$$S(g) = \frac{1}{n^2}\sum_{r=1}^{n}\sum_{w=1}^{n}\left|\frac{1}{2} - Q_{r,w}(g)\right|$$

where

$$Q_{r,w}(g) = 2^{-n}\sum_{x \in B^n} g_w(x) \oplus g_w(x \oplus e_r)$$

and $e_r = [\theta_{r,1}\ \theta_{r,2}\ \ldots\ \theta_{r,n}]^T$, with $\theta_{r,w} = 0$ for $r \ne w$ and $\theta_{r,w} = 1$ for $r = w$.
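The dependence matrix described above can be computed directly from an S-box lookup table. The following is a minimal sketch with hypothetical function names; entry [r][w] is the fraction of inputs for which output bit w changes when input bit r is inverted, and the two summary values are the mean of the matrix and its mean absolute offset from 0.5.

```python
def dependence_matrix(sbox):
    """Q[r][w]: fraction of inputs x where output bit w flips with input bit r."""
    size = len(sbox)
    n = size.bit_length() - 1
    counts = [[0] * n for _ in range(n)]
    for r in range(n):
        for x in range(size):
            diff = sbox[x] ^ sbox[x ^ (1 << r)]
            for w in range(n):
                counts[r][w] += (diff >> w) & 1
    return [[count / size for count in row] for row in counts]


def sac_mean(sbox):
    """Average of all dependence matrix entries (ideal value 0.5)."""
    matrix = dependence_matrix(sbox)
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


def sac_offset(sbox):
    """Mean absolute deviation of the dependence matrix from 0.5."""
    matrix = dependence_matrix(sbox)
    n = len(matrix)
    return sum(abs(0.5 - q) for row in matrix for q in row) / (n * n)


# The identity mapping is a worst case: flipping input bit r flips only output bit r.
identity = list(range(256))
print(sac_mean(identity), sac_offset(identity))  # 0.125 0.5
```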

BIT Independent Criterion (BIC)
BIC is another cryptographic metric used to measure the quality of S-boxes. For an S-box to satisfy the BIC property, all avalanche variables should be pairwise independent for a given set of avalanche vectors created by modifying a single bit of plaintext. BIC states that inverting input bit i modifies output bits j and k in such a way that no dependency exists between these output bits. This tends to improve the effectiveness of the confusion function. To satisfy the BIC property, the output bits must exhibit independent behavior. Therefore, efforts are made to decrease the dependency between output bits. The correlation coefficient is used to calculate the degree of independence among avalanche variable pairs. The bit independence of the jth and kth bits of $B^{e_i}$ is:

$$BIC(b_j, b_k) = \max_{1 \le i \le n}\left|\operatorname{corr}\left(b_j^{e_i}, b_k^{e_i}\right)\right|$$

The S-box function h is defined as $h: \{0, 1\}^n \to \{0, 1\}^n$. The BIC parameter for the S-box function is expressed as follows:

$$BIC(h) = \max_{1 \le j,k \le n,\ j \ne k} BIC(b_j, b_k)$$

The average SAC-BIC score of our optimized S-box is 0.49937, which is almost optimal and indicates that the proposed S-box fulfills the required criterion.
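One common way to obtain a SAC-BIC score like the one reported above is to apply the avalanche test to the XOR of every pair of output bits. The sketch below follows that approach as an assumption; the paper does not spell out its exact procedure, and the function names are hypothetical.

```python
from itertools import combinations


def bic_sac(sbox):
    """SAC value of output bit j XOR output bit k, for every pair j < k."""
    size = len(sbox)
    n = size.bit_length() - 1
    # Avalanche vectors: output difference caused by inverting input bit i.
    diffs = [[sbox[x] ^ sbox[x ^ (1 << i)] for x in range(size)] for i in range(n)]
    scores = {}
    for j, k in combinations(range(n), 2):
        flips = sum(
            ((diff >> j) ^ (diff >> k)) & 1
            for row in diffs
            for diff in row
        )
        scores[(j, k)] = flips / (n * size)
    return scores


def average_bic_sac(sbox):
    """Mean SAC-BIC value over all output bit pairs (ideal value 0.5)."""
    scores = bic_sac(sbox)
    return sum(scores.values()) / len(scores)


# For the identity mapping, bit j XOR bit k flips only when input bit j or k
# is inverted, i.e. for 2 of the 8 input bits.
print(average_bic_sac(list(range(256))))  # 0.25
```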

Linear Approximation Probability (LP)
LP is used to evaluate the security of an S-box against linear cryptanalysis. An S-box provides diffusion and confusion of bits through the nonlinear mapping between input and output. LP determines the maximum imbalance of an event, namely that the parity of the input bits selected by the mask γ1 is equal to the parity of the output bits selected by the mask γ2. The linear approximation probability is represented as:

$$LP = \max_{\gamma_1, \gamma_2 \ne 0}\left|\frac{\#\{x \in X \mid x \cdot \gamma_1 = S(x) \cdot \gamma_2\}}{2^n} - \frac{1}{2}\right|$$

where γ1 represents the input mask and γ2 represents the output mask. These masks are used to calculate the linear approximation probability. X denotes the set of all possible inputs and $2^n$ is the total number of elements in the S-box. An S-box with a low linear probability represents a highly nonlinear mapping and has high resistance against linear cryptanalysis. The maximum LP value of the optimized S-box is 0.140625, which fulfills the desired criterion.
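The maximum over all mask pairs can be evaluated efficiently because, for a fixed output mask γ2, the Walsh transform of $(-1)^{S(x) \cdot \gamma_2}$ gives twice the count minus $2^n$ for every input mask γ1 at once. The sketch below uses that identity; the function names are hypothetical and the code is an illustration, not the authors' evaluation script.

```python
def fwht(values):
    """In-place style fast Walsh-Hadamard transform on a copy of the input."""
    data = list(values)
    step = 1
    while step < len(data):
        for start in range(0, len(data), 2 * step):
            for i in range(start, start + step):
                a, b = data[i], data[i + step]
                data[i], data[i + step] = a + b, a - b
        step *= 2
    return data


def parity(value):
    return bin(value).count("1") & 1


def linear_approximation_probability(sbox):
    """max over nonzero masks of |#{x : x.g1 = S(x).g2} / 2^n - 1/2|."""
    size = len(sbox)
    best = 0
    for out_mask in range(1, size):
        signs = [1 - 2 * parity(y & out_mask) for y in sbox]
        spectrum = fwht(signs)
        # spectrum[g1] = 2 * count - 2^n; index 0 (zero input mask) is skipped.
        best = max(best, max(abs(value) for value in spectrum[1:]))
    return best / (2 * size)


# A linear mapping such as the identity reaches the worst possible value.
print(linear_approximation_probability(list(range(256))))  # 0.5
```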

Differential Approximation Probability (DP)
Differential approximation probability is the probability that the output exhibits a difference of Δy when the input is changed by Δx. DP examines the XOR distribution between the input and output bits, that is, how variations in the input map to variations in the output. For resilience against differential attacks, the XOR values of all outputs and inputs must have equal probability. The exclusive-OR distribution among the inputs and outputs of the S-box is calculated by:

$$DP(\Delta x \to \Delta y) = \frac{\#\{x \in X \mid S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^n}$$

where X represents the set of all possible input values, $2^n$ is the total number of elements in the S-box, and Δx and Δy are the input and output differentials. S-boxes with small differential values are strong and good at resisting differential cryptanalysis. The DP results of Sbox1 are given in Tab. 4, and we can see that our proposed S-box fulfils the DP criterion.
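The XOR distribution can be tabulated by brute force for an 8-bit S-box. The following minimal sketch (hypothetical function names) builds one row of the difference distribution per input differential and returns the largest probability over all nonzero Δx.

```python
def xor_distribution_row(sbox, delta_x):
    """Counts of each output differential for one input differential."""
    counts = [0] * len(sbox)
    for x in range(len(sbox)):
        counts[sbox[x] ^ sbox[x ^ delta_x]] += 1
    return counts


def differential_probability(sbox):
    """Maximum of #{x : S(x) xor S(x xor dx) = dy} / 2^n over nonzero dx."""
    size = len(sbox)
    best = 0
    for delta_x in range(1, size):
        best = max(best, max(xor_distribution_row(sbox, delta_x)))
    return best / size


# For a linear mapping every input differential gives a single output
# differential, so the probability is 1.
print(differential_probability(list(range(256))))  # 1.0
```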

Substitution Permutation based Image Encryption
In this section, a simple Substitution Permutation Network (SPN) is used for image encryption. The detailed steps of the SPN are given below. For the substitution step, the optimized S-boxes of Section 3.3 are used; the P-boxes and keys are taken from [57].
Step-1: Extract the red, green and blue channel values from the color image frames and repeat Step-2 to Step-4 for every channel (R, G or B).
Step-2: Perform the bitwise XOR between the plain pixel values and subkey_i to obtain the key-additive pixel values.
Step-3: Replace each value of Step-2 with the corresponding entry of the optimized Sbox_i.
Step-4: Diffuse each value of Step-3 by using the P-box_i values.
Step-5: Combine the individual R, G, B encrypted pixel values into a single frame.
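The steps above can be sketched for a single round as follows. Several details are assumptions made only for illustration, since the P-boxes and keys come from [57] and are not described here: the P-box is modeled as a permutation of the 8 bit positions of each value, the subkey is a short byte sequence reused cyclically over the pixels, and all function names and sample values are hypothetical.

```python
def permute_bits(value, pbox):
    """Move bit `source` of value to position pbox[source] (assumed P-box model)."""
    out = 0
    for source, target in enumerate(pbox):
        out |= ((value >> source) & 1) << target
    return out


def invert(table):
    """Inverse lookup table of a permutation."""
    inverse = [0] * len(table)
    for index, value in enumerate(table):
        inverse[value] = index
    return inverse


def encrypt_channel(pixels, subkey, sbox, pbox):
    cipher = []
    for position, pixel in enumerate(pixels):
        mixed = pixel ^ subkey[position % len(subkey)]  # key addition
        substituted = sbox[mixed]                       # substitution
        cipher.append(permute_bits(substituted, pbox))  # diffusion
    return cipher


def decrypt_channel(cipher, subkey, sbox, pbox):
    inv_sbox, inv_pbox = invert(sbox), invert(pbox)
    plain = []
    for position, value in enumerate(cipher):
        substituted = permute_bits(value, inv_pbox)
        plain.append(inv_sbox[substituted] ^ subkey[position % len(subkey)])
    return plain


def encrypt_image(pixels, subkeys, sboxes, pboxes):
    """pixels: list of (R, G, B) tuples; one subkey, S-box and P-box per channel."""
    channels = [list(channel) for channel in zip(*pixels)]  # split R, G, B
    encrypted = [
        encrypt_channel(channel, subkeys[i], sboxes[i], pboxes[i])
        for i, channel in enumerate(channels)
    ]
    return list(zip(*encrypted))                            # recombine channels


# Placeholder tables, NOT the optimized S-boxes or the P-boxes of [57].
demo_sbox = [(7 * x + 3) % 256 for x in range(256)]
demo_pbox = [1, 2, 3, 4, 5, 6, 7, 0]
demo_key = [0x5A, 0xC3]
frame = [(10, 20, 30), (40, 50, 60)]
encrypted_frame = encrypt_image(frame, [demo_key] * 3, [demo_sbox] * 3, [demo_pbox] * 3)
```

Because every stage is invertible (XOR, a bijective S-box, a bit permutation), decryption simply applies the inverse tables in reverse order, which is why the bijective property enforced during optimization matters.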
The Number of Pixels Change Rate (NPCR) and the Unified Average Changing Intensity (UACI) are two frequently used tests for checking the strength of an image cipher against various attacks. They are evaluated over a large number of plain images to measure the impact of a pixel change on the encrypted images. We examined the image encryption results on various standard color images (Lena, pepper, nature, bird, baboon, grapes, sparrow, butterfly). An ideal image encryption algorithm must produce different results when a single pixel of the image is slightly varied. NPCR is the rate of change in the number of pixels between two encrypted images obtained from two slightly different plain images.
To achieve maximum sensitivity, the value of NPCR should be close to 100%. UACI measures the mean variation of pixel intensity between two encrypted images at the same location. Tab. 5 lists the values of NPCR and UACI. Our NPCR values are consistently around 99.63, which is a very good value. Similarly, the UACI values are around 33.5, which is also a good value.
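Both measures can be computed from two encrypted images of equal size. The sketch below assumes 8-bit pixel values flattened into plain Python sequences; the function names are hypothetical.

```python
def npcr(cipher_a, cipher_b):
    """Percentage of positions at which the two encrypted images differ."""
    changed = sum(1 for a, b in zip(cipher_a, cipher_b) if a != b)
    return 100.0 * changed / len(cipher_a)


def uaci(cipher_a, cipher_b, max_intensity=255):
    """Mean absolute intensity difference, as a percentage of the maximum."""
    total = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * total / (max_intensity * len(cipher_a))


first = [0, 0, 0, 0]
second = [255, 0, 255, 0]
print(npcr(first, second), uaci(first, second))  # 50.0 50.0
```

For two independent, uniformly random 8-bit images the expected values are roughly 99.61% for NPCR and 33.46% for UACI, which is the reference point for the values reported in Tab. 5.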

Figure 2: Plotted S-boxes (a) 8-step random walk of Sbox-1 (b) 8-step random walk of Sbox-2

Results and Evaluation
In this section, the sample S-boxes of Section 3.2 are evaluated against the S-box evaluation criteria described in Sections 4.1 to 4.5.

Figure 3: Nonlinearity score of TRNG and GA based S-box construction

Figure 1: Proposed System Architecture

Table 2: Query-wise search results

Table 4: DP of S-box1

Table 5: NPCR and UACI results

Conclusion
IoT-based computationally intelligent schemes are expanding amid growing concerns about privacy leakage and security attacks in connected systems. The IoT has added a new potential to the internet by enabling communication between objects and humans, creating a smarter and more intelligent ecosystem. In this regard, block encryption algorithms have been among the most reliable options by which data security is accomplished. The strength of a block encryption algorithm against different attacks depends on its nonlinear primitive, the S-box. The objective of this research is the dynamic generation and optimization of highly secure S-boxes for block encryption algorithms. For this purpose, we used true random numbers as the entropy source of the proposed method, because true random numbers are irreversible, unpredictable, and unreproducible. The proposed method passes all the security evaluation criteria, including nonlinearity, linear approximation probability, differential approximation probability, strict avalanche criterion, bit independence criterion, differential analysis, histogram analysis, correlation coefficient tests, and the NPCR and UACI tests. The results of our method validate that the proposed dynamic S-boxes are effective for block encryption algorithms. In the future, we will extend this research to the design of a cryptographic key generation technique for IoT systems.