Data Traffic Reduction with Compressed Sensing in an AIoT System

Abstract: To provide Artificial Intelligence (AI) services such as object detection, Internet of Things (IoT) sensor devices should be able to send a large amount of data such as images and videos. However, this inevitably causes IoT networks to be severely overloaded. In this paper, therefore, we propose a novel oneM2M-compliant Artificial Intelligence of Things (AIoT) system for reducing overall data traffic and offering object detection. It consists of IoT sensor devices with random sampling functions controlled by a compressed sensing (CS) rate, an IoT edge gateway with CS recovery and domain transform functions for compressed sensing and a YOLOv5 deep learning function for object detection, and an IoT server. By analyzing the effects of compressed sensing on data traffic reduction in terms of data rate per IoT sensor device, we show that the proposed AIoT system can reduce the overall data traffic by changing the compressed sensing rates of the random sampling functions in the IoT sensor devices. In addition, we analyze the effects of compressed sensing on YOLOv5 object detection in terms of performance metrics such as recall, precision, mAP50, and mAP. We find that recall slightly decreases while precision remains almost constant as the compressed sensing rate decreases, and that mAP50 and mAP degrade gradually as the compressed sensing rate decreases. Consequently, if proper compressed sensing rates are chosen, the proposed AIoT system will reduce the overall data traffic without significant performance degradation of YOLOv5.


Introduction
Internet of Things (IoT) is a dynamic global network infrastructure with self-configuration capabilities based on standards and interoperable communication protocols, in which physical and virtual things have their own identities and properties and are integrated into IoT networks through various wired and wireless interfaces [1]. IoT networks are interconnected worldwide networks based on sensory, communication, networking, and information processing technologies. Modern wireless technologies have extended the sensory capabilities of IoT devices and significantly expanded the range of IoT networks. Recently, emerging technologies such as artificial intelligence (AI), edge computing, and compressed sensing have been applied to IoT to meet the needs of users and provide specific services [2,3]. Artificial Intelligence of Things (AIoT) is a new technology enabling IoT sensor devices to analyze their sensing data, make decisions, and act on those decisions without human involvement [2]. The literature on incorporating compressed sensing (CS), often called compressive sensing, into IoT applications was reviewed in [4].
IoT networks include wireless local area networks (WLANs) and low power wide area networks (LPWANs). Wireless fidelity (WiFi) is a WLAN technology following the IEEE 802.11 standards and is one of the most widespread access networks for providing wireless connectivity to IoT devices [5]. It has infrastructure and ad-hoc modes provided by all versions of the WiFi standard family. IEEE 802.11n and 802.11ac can offer maximum data rates of 600 Mbps and 7 Gbps, respectively. Next, long-term evolution for machines (LTE-M) and narrowband IoT (NB-IoT) have been specified as LPWANs since 3GPP Release 12 and Release 13 to support massive machine-type communications (mMTC) [6]. The requirements for mMTC are almost identical to those for LTE-M and NB-IoT [7]. A low-complexity User Equipment (UE) Category M1 was presented as LTE-M in Release 13 to enable low-cost devices, extended discontinuous reception cycles for reduced power consumption, and coverage enhancement mode operation. NB-IoT was also introduced to offer flexibility of deployment by allowing the use of a small portion of the available spectrum in the LTE bands and by coexisting with LTE and the global system for mobile communication (GSM) in licensed frequency bands. Note that LTE-M supports a peak data rate of 1 Mbps for both downlink and uplink, and NB-IoT supports a data rate of 200 kbps for downlink and 20 kbps for uplink [8]. The IoT edge gateway plays an important role in heterogeneous IoT networks [9]. Its main function is to forward the data from IoT sensor devices to a destination node, namely an IoT server, through existing wireless communication protocols such as WiFi, LTE-M, NB-IoT, ZigBee, and Bluetooth. In recent years, IoT sensor devices have been required to send a large amount of data such as images, videos, and voices to provide AI services such as object detection using deep learning in IoT applications [10]. Inevitably, IoT networks are facing severe data traffic overload problems.
For instance, such overload may cause delay or latency due to limited bandwidth and unstable channel conditions (e.g., congestion, collisions, and interference) and lead to delayed decision-making for time-sensitive operations. Moreover, the centralized IoT server becomes inefficient and expensive for storing and processing a large amount of data from various types of IoT sensor devices. Jung et al. [3] proposed a oneM2M-compliant AIoT monitoring system where an AIoT edge device extracted video frame images from a CCTV camera in a pig house, detected multiple pigs in the images with a faster region-based convolutional neural network model, and tracked them with an object center-point tracking algorithm. However, they did not consider the data traffic problem. Although Djelouat et al. [4] reviewed CS-based IoT applications, highlighted emerging trends, and identified avenues for future CS-based IoT research, they did not present any CS-based IoT system model or experimental results. In [11], the authors proposed a gradient CS method for image data reduction in unmanned aerial vehicles (UAVs), where a surveillance center reconstructed images from the reduced number of pixels received from the UAVs and then applied an image processing method to detect suspicious objects. However, they did not consider an AIoT system model using object detection based on deep learning. Moreover, they did not analyze the effects of the compressed sensing rate on an AIoT system model in detail. Therefore, in this paper, we propose an AIoT system model using compressed sensing to solve the data traffic problem that occurs when transmitting a large amount of data through the IoT network for an AI service such as YOLOv5 object detection.
Our main contributions are as follows: First, we suggest a practical AIoT system architecture for data traffic reduction between IoT sensor devices and an IoT edge gateway through separating a random sampling matrix, a CS recovery, and a transformation matrix of compressed sensing. Second, we analyze the effects of compressed sensing on various aspects of the proposed AIoT system such as data traffic reduction and object detection performance to provide a performance reference. The rest of this paper is organized as follows. Section 2 introduces the configuration and operation of the proposed AIoT system and explains functional separation of compressed sensing for data traffic reduction and YOLOv5 object detection. Section 3 analyzes some experimental results to show the effects of the compressed sensing on YOLOv5 object detection. Finally, concluding remarks are presented in Section 4.

Proposed AIoT System Model
The proposed AIoT system model is illustrated in Fig. 1. It consists of IoT sensor devices with random sampling functions for compressed sensing, an IoT edge gateway with CS recovery and domain transform functions for compressed sensing and a YOLOv5 deep learning function for object detection, and an IoT server. Each IoT sensor device captures large data, e.g., an M × N image f, from a high definition (HD) camera sensor and compresses the image by using a K × M random sampling matrix according to a compressed sensing rate α = K/M in order to obtain a K × N compressed image b. It then uploads the compressed image b and interacts with the IoT edge middleware (MW) [12] via the IoT client application, modeled as an application dedicated node-application entity (ADN-AE) in the oneM2M specifications [13]. In the IoT edge gateway, the IoT edge MW, modeled as a middle node-common service entity (MN-CSE) in [13], temporarily stores the compressed image b and feeds it to the CS recovery. The CS recovery produces the sparse transform domain representation x̂ of the original image f through L1 minimization. The domain transform matrix produces the recovered image f̂ by inverse transforming x̂. In addition, YOLOv5 performs object detection on the recovered image f̂ and forwards the detection results to the IoT edge MW. The IoT edge MW stores the results and sends them to the IoT server MW, modeled as an infrastructure node-common service entity (IN-CSE) in [13]. Note that the IoT client application, the IoT edge MW, and the IoT server MW were all implemented on the oneM2M-compliant IoT device and server platforms nCube and Mobius [14].
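The sensor-side compression step described above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the K × M random sampling matrix is taken to be a row selector (each of its rows contains a single 1), so that applying it keeps K randomly chosen rows of the image. The function name `random_row_sampling`, the seed, and the toy image are hypothetical and are not taken from the nCube-based implementation.

```python
import random


def random_row_sampling(image, alpha, seed=0):
    """Keep K = round(alpha * M) randomly chosen rows of an M x N image.

    Assumption for this sketch: the K x M sampling matrix has a single 1
    per row, so multiplying by it simply selects K rows of the image.
    """
    m = len(image)
    k = max(1, round(alpha * m))
    rng = random.Random(seed)
    kept_rows = sorted(rng.sample(range(m), k))
    compressed = [image[r][:] for r in kept_rows]
    return compressed, kept_rows


# Toy 8 x 6 "image" whose pixel value encodes its position.
f = [[10 * r + c for c in range(6)] for r in range(8)]
b, rows = random_row_sampling(f, alpha=0.5)
print(len(b), "x", len(b[0]))  # K x N
```

With α = 0.5 the device would upload half of the rows, which is where the traffic reduction comes from; the gateway needs to know which rows were kept (here `rows`) to set up the recovery problem.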

Compressed Sensing for Data Traffic Reduction
Compressed sensing is known as a signal sampling framework for improving sampling efficiency by sampling sparse signals at a rate much lower than the Nyquist rate [15,16]. It accurately recovers the original high-dimensional sparse signal from low-dimensional measurement vectors with high probability by solving an optimization problem with a sparsity constraint. For random sampling in compressed sensing, single pixel, Gaussian, Bernoulli, and sparse random matrices are usually used, and among them we choose a random single pixel matrix [17]. In addition, there are many optimization methods for CS recovery such as L1 minimization, L2 minimization, and the least absolute shrinkage and selection operator (LASSO), with recovery guarantees commonly stated in terms of the restricted isometry property (RIP), but we consider the basic L1 minimization method [18]. In the following, we review the random sampling matrix and the L1 minimization method for compressed sensing in our AIoT system model and numerically analyze the effects of compressed sensing on data traffic reduction.
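To make the idea of L1-based recovery concrete, the following toy sketch recovers a sparse 1-D signal from a random subset of its samples. It rests on several assumptions that differ from the system in this paper: it works on a 1-D signal instead of an image, it uses an orthonormal 1-D inverse DCT as the sparsifying transform, and it solves the L1-regularized least-squares problem with iterative soft-thresholding (ISTA) rather than the equality-constrained L1 minimization considered here. All names (`idct_matrix`, `ista`, `lam`, `iters`) are hypothetical.

```python
import math
import random


def idct_matrix(m):
    """Orthonormal 1-D inverse DCT matrix Psi, so that f = Psi x."""
    return [
        [
            math.sqrt((1.0 if p == 0 else 2.0) / m)
            * math.cos(math.pi * (2 * row + 1) * p / (2 * m))
            for p in range(m)
        ]
        for row in range(m)
    ]


def matvec(a, x):
    """Compute A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, y):
    """Compute A^T y."""
    return [sum(a[i][j] * y[i] for i in range(len(a))) for j in range(len(a[0]))]


def soft(v, t):
    """Soft-thresholding operator, the proximal map of t * |v|."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def objective(a, x, b, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = [bi - axi for bi, axi in zip(b, matvec(a, x))]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xi) for xi in x)


def ista(a, b, lam=0.01, iters=500):
    """ISTA with unit step size, valid here because the rows of A are
    orthonormal (A is a row subset of an orthogonal matrix)."""
    x = [0.0] * len(a[0])
    for _ in range(iters):
        r = [bi - axi for bi, axi in zip(b, matvec(a, x))]
        g = rmatvec(a, r)
        x = [soft(xi + gi, lam) for xi, gi in zip(x, g)]
    return x


M, K = 32, 24                        # signal length and number of samples
rng = random.Random(1)
psi = idct_matrix(M)
x_true = [0.0] * M
x_true[2], x_true[7] = 5.0, -3.0     # sparse transform-domain signal
f = matvec(psi, x_true)              # original signal
rows = sorted(rng.sample(range(M), K))
A = [psi[r] for r in rows]           # sampling combined with the transform
b = [f[r] for r in rows]             # compressed measurements
x_hat = ista(A, b)
f_hat = matvec(psi, x_hat)           # recovered signal
err = max(abs(u - v) for u, v in zip(f, f_hat))
print(f"max reconstruction error: {err:.3f}")
```

The regularization weight `lam` trades sparsity against data fidelity; a dedicated solver for the constrained problem would be a closer match to the method used in this paper.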
As mentioned before, each IoT sensor device performs the compressed sensing that can be given as

$$ b = \Phi f \tag{1} $$

where Φ is the K × M random sampling matrix that randomly selects the pixels of the original image f according to the compressed sensing rate α. Note that this can significantly reduce the data traffic because the device transmits the compressed image b instead of the original image f. For instance, if the compressed sensing rate α = 0.5, we only transmit a 320 × 640 compressed image b (K = αM = 320) instead of the original 640 × 640 image f. The IoT edge gateway performs the CS recovery in order to find the optimal sparse transform domain representation x̂ of the original image f by using the L1 minimization method, which is represented as

$$ \hat{x} = \arg\min_{x} \|x\|_1 \quad \text{subject to} \quad A x = b \tag{2} $$

where A ≡ ΦΨ has both sampling (Φ) and transformation (Ψ) functions and $\|x\|_1 = \sum_{k=0}^{n} |x_k|$ denotes the L1 norm. The 2-dimensional (2D) inverse discrete cosine transform (IDCT) matrix Ψ transforms the image from the spectral domain (x̂) to the spatial domain (f̂), which can be represented as follows:

$$ \hat{f} = \Psi \hat{x} \tag{3} $$

where the entries of f̂ are defined as

$$ \hat{f}_{mn} = \sum_{p=0}^{M-1} \sum_{q=0}^{N-1} \alpha_p \alpha_q \, \hat{x}_{pq} \cos\frac{\pi (2m+1) p}{2M} \cos\frac{\pi (2n+1) q}{2N} \tag{4} $$

where 0 ≤ m ≤ M − 1 and 0 ≤ n ≤ N − 1. Also, α_p and α_q are represented by

$$ \alpha_p = \begin{cases} 1/\sqrt{M}, & p = 0 \\ \sqrt{2/M}, & 1 \le p \le M-1 \end{cases} \qquad \alpha_q = \begin{cases} 1/\sqrt{N}, & q = 0 \\ \sqrt{2/N}, & 1 \le q \le N-1 \end{cases} \tag{5} $$

In Fig. 2, we explain the effects of compressed sensing on data traffic reduction in terms of the data rate per IoT sensor device and the number of IoT sensor devices through simple analytical results. We assume that the IoT network between the IoT sensor devices and the IoT edge gateway is an IEEE 802.11n WLAN whose maximum data rate is 600 megabits per second (Mbps). If the size of the original image or frame is 640 × 640 pixels and the frame rate is 30 frames per second (fps) for an HD camera sensor, the data rate per IoT sensor device without compressed sensing will be about 30 × 640 × 640 × 8 bits ≈ 98 Mbps. In addition, we assume that the same compressed sensing rate α is used for all IoT sensor devices. When the compressed sensing rate α, expressed hereafter as a percentage, is 100, only six IoT sensor devices can transmit their data at the same time over the IoT network. If the compressed sensing rate α is 90, the data rate for each IoT sensor device becomes 88.2 Mbps and the overall data traffic is consequently reduced from 600 Mbps to about 541 Mbps. In the case of the compressed sensing rate α = 50, the data rate for each IoT sensor device is 49 Mbps and the overall data traffic is 306 Mbps. In other words, we are able to increase the number of IoT sensor devices from six to twelve under the assumption that the overall data traffic is 600 Mbps. As the compressed sensing rate α decreases, the overall data traffic decreases or the number of IoT sensor devices can be increased. However, the performance of the CS recovery will decrease accordingly, and the performance of object detection based on the images resulting from the CS recovery will decrease as well. Therefore, to judge whether compressed sensing is useful for reducing data traffic, we ultimately need to analyze the performance of object detection, which is covered in Section 3.
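The per-device arithmetic in this analysis can be reproduced with a short Python sketch. It assumes 8 bits per pixel (the factor 8 in the text) and treats the 600 Mbps link rate as fully usable, which ignores protocol overhead; the function name `data_rate_mbps` and the constant name are hypothetical. Because the sketch does not round 98.304 Mbps down to 98 Mbps, its per-device values differ slightly from the rounded figures quoted above.

```python
def data_rate_mbps(width, height, fps, bits_per_pixel, alpha):
    """Per-device data rate in Mbps for a compressed sensing rate alpha in (0, 1]."""
    return alpha * width * height * fps * bits_per_pixel / 1e6


LINK_CAPACITY_MBPS = 600  # IEEE 802.11n maximum data rate assumed in the text

for alpha in (1.0, 0.9, 0.5):
    rate = data_rate_mbps(640, 640, 30, 8, alpha)
    devices = int(LINK_CAPACITY_MBPS // rate)  # devices that fit on the link
    print(f"alpha={alpha:.1f}: {rate:.1f} Mbps per device, up to {devices} devices")
```

Halving the compressed sensing rate halves the per-device rate, which is why the number of devices that fit within 600 Mbps doubles from six to twelve.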

Object Detection with YOLOv5
We consider the YOLOv5 model, the latest release of the YOLO series at the time of writing, for object detection in the IoT edge gateway. Because the YOLO series integrates target area prediction and category determination into a single neural network model [19][20][21], it can address the speed problem that is critical to implementing the IoT edge gateway. In particular, the execution speed of YOLOv5 has been greatly improved even compared with YOLOv4. Also, YOLOv5 is about 88% smaller than YOLOv4. Thus, it is suitable for installation on the IoT edge gateway while offering higher accuracy and a better ability to detect small objects.
As shown in Fig. 3, YOLOv5 uses a backbone architecture, referred to as CSPDarknet, for feature extraction by incorporating a cross stage partial network (CSPNet) into Darknet. The CSPNet partitions a feature map of the base layer into two parts and then merges them through a cross-stage hierarchy so that the gradient flow propagates through different network paths. By integrating the gradient changes into the feature map, it decreases the number of model parameters and the floating point operations per second, thereby reducing model size while ensuring inference speed and accuracy [22]. Next, YOLOv5 exploits a path aggregation network (PANet) for feature fusion, which adopts a feature pyramid network (FPN) structure with an enhanced bottom-up path to improve the propagation of low-level features. Simultaneously, the PANet performs adaptive feature pooling, which links the feature grid and all feature levels so that useful information in each feature level propagates directly to the following subnetwork. This can improve the location accuracy of the object. Finally, the YOLO layer is used for producing detection results such as class, score, location, and size. It generates three feature maps of sizes 18 × 18, 36 × 36, and 72 × 72 to achieve multi-scale prediction and handle small, medium, and big objects.
For training the YOLOv5 network, a binary cross-entropy with logits loss, which combines a sigmoid function and a binary cross-entropy loss in one single class [23], is utilized as the loss function given by

$$ L = -\frac{1}{N} \sum_{n=1}^{N} \left[ y_n \log \sigma(\hat{y}_n) + (1 - y_n) \log\left(1 - \sigma(\hat{y}_n)\right) \right] \tag{6} $$

where N is the batch size, and y_n and ŷ_n denote the ground truth and the predicted value of the nth element of an array in the batch, respectively. σ(x) denotes the element-wise sigmoid function represented by

$$ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{7} $$

The stochastic gradient descent (SGD) optimizer is considered for minimizing the loss function. It repeats the process of obtaining the gradient and updating the model parameters for each training sample rather than obtaining an accurate gradient using the entire training data [24][25][26]. YOLOv5 has four different versions: a small (s), a medium (m), a large (l), and an extra-large (x) model. YOLOv5s has the lowest performance but the highest frame rate, whereas YOLOv5x has the best performance but the lowest frame rate. We use the YOLOv5x model, which has the best performance.
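As a rough illustration of the loss and optimizer described above, the sketch below evaluates a mean-reduced binary cross-entropy directly from raw scores and performs a single SGD update on a one-parameter logistic model. It is not the YOLOv5 training code: the mean reduction, the one-parameter model, and the names `bce_with_logits` and `sgd_step` are assumptions made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed from raw scores (logits)."""
    total = 0.0
    for z, y in zip(logits, targets):
        # max(z, 0) - z*y + log(1 + exp(-|z|)) equals
        # -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))] without overflow.
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def sgd_step(w, x, y, lr=0.1):
    """One SGD update of a one-parameter logistic model on a single sample."""
    grad = (sigmoid(w * x) - y) * x  # derivative of the per-sample loss w.r.t. w
    return w - lr * grad


print(bce_with_logits([0.0, 2.0, -1.0], [1.0, 1.0, 0.0]))
w_new = sgd_step(0.0, x=1.0, y=1.0)
print(w_new)
```

Folding the sigmoid into the loss, as done here, avoids taking the logarithm of a value that has already been rounded to 0 or 1, which is the usual motivation for a combined "with logits" loss.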

Experimental Results
We describe experimental results to evaluate the performance of the proposed AIoT system model in terms of the effects of compressed sensing on object detection. First, we developed each IoT sensor device with a Raspberry Pi 4 Model B, where the IoT client application and the random sampling function were implemented based on [14,27], respectively. Instead of real-time images from a C270 HD webcam attached to the Raspberry Pi 4, 128 images of the COCO dataset were fed into the IoT sensor device for objective performance evaluation. Next, we built the IoT edge gateway on a personal computer with an NVIDIA GeForce GTX 960, CUDA 10.2, and cuDNN 7.6.5, in which the IoT edge middleware, CS recovery, and YOLOv5x were implemented based on [14,27,28], respectively. In particular, PyTorch 1.6.0 running on Ubuntu 18.04.4 was employed as an open source machine learning framework for YOLOv5x. Finally, the IoT server was implemented but not used since it was not necessary for performance evaluation.

Fig. 4 shows the detection results of the YOLOv5 model for an original image f and recovered images f̂ with different compressed sensing rates α. As mentioned before, assuming that the compressed sensing rate is 70, only 70% of the pixels, randomly selected from the original image, are sent to the IoT edge gateway, which produces recovered images through the CS recovery and then feeds them to the YOLOv5 model. Compared with the original image, recovered images have many contaminated pixels. As the compressed sensing rate decreases, contaminated pixels tend to increase. However, there appears to be no significant difference between them in the instantaneous performance of bounding box regression and multi-label classification of YOLOv5.

We compare recall and precision performances according to compressed sensing rates in Fig. 5. Recall and precision are generally defined as [29]

$$ \text{Recall} = \frac{TP}{TP + FN} \tag{8} $$

$$ \text{Precision} = \frac{TP}{TP + FP} \tag{9} $$

where TP is true positive, FN is false negative, and FP is false positive.
Recall denotes the ratio of the number of correctly detected objects (TP) to the total number of actual objects (TP+FN), namely the ground truth. It gradually decreases because the number of correctly detected objects decreases as the compressed sensing rate decreases. Precision denotes the ratio of the number of correctly detected objects (TP) to the total number of detected objects (TP+FP). It remains almost constant as the compressed sensing rate decreases, because the number of correctly detected objects (TP) and the total number of detected objects (TP+FP) decrease together. We compare mean average precision 50 (mAP50) and mAP performances according to compressed sensing rates in Fig. 6. As a performance metric for measuring the accuracy of object detection models, AP is the precision averaged across all recall values between 0 and 1 at various Intersection over Union (IoU) thresholds and is interpreted as the area under the precision-recall curve. mAP corresponds to the AP averaged over all classes for IoU threshold values ranging from 0.5 to 0.95 with a step size of 0.05, and mAP50 denotes mAP at IoU = 0.5 [30]. Note that the IoU is the ratio of the intersecting area of the predicted bounding box and the ground-truth bounding box to the total area of the two combined. In general, if the IoU value is more than 0.5, the object is judged as being detected properly. Otherwise, it is judged as being incorrect. At the compressed sensing rate α = 100, equivalent to the original image, mAP50 is 0.691 and this serves as a reference. It becomes 0.667 at α = 90, 0.656 at α = 80, and 0.598 at α = 50. Even though the compressed sensing rate decreases, the degradation in mAP50 performance is modest.
The case of mAP also shows a similar tendency for performance degradation.
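The metrics used in this section can be illustrated with a small Python sketch. The box format `(x1, y1, x2, y2)`, the function names, and the example boxes and counts are hypothetical and are not taken from the experiments; the sketch only mirrors the definitions of IoU, recall, and precision given above.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(tp, fn):
    """Fraction of ground-truth objects that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Fraction of detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


predicted = (0, 0, 10, 10)
ground_truth = (5, 0, 15, 10)
score = iou(predicted, ground_truth)
print(f"IoU = {score:.3f}, counted as detected: {score > 0.5}")

# Hypothetical counts: 8 correct detections, 2 false alarms, 4 missed objects.
print(f"recall = {recall(8, 4):.3f}, precision = {precision(8, 2):.3f}")
```

In this toy case the two boxes overlap by half of each box, yet the IoU is only one third, so the detection would not count as correct at the 0.5 threshold used for mAP50.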
By analyzing the experimental results in Figs. 2 and 6 together, we can draw a meaningful conclusion. When the compressed sensing rate α = 100, the value of mAP50 is the same as the value of mAP50 in [28], which is considered as a counterpart method without compressed sensing. The counterpart method gives the overall data traffic (588 Mbps) and the value of mAP50 (0.691) as two reference points for performance comparison. For instance, if the compressed sensing rate α = 80, we can reduce the data rate of each of the six IoT sensor devices by about 19.6 Mbps, thus reducing the overall data traffic by about 117.6 Mbps (20%) or adding one more IoT sensor device. From the point of view of performance degradation, however, the value of mAP50 is only reduced by 0.035 (5%). Assuming the compressed sensing rate α = 50, we can reduce the overall data traffic by 294 Mbps (50%) or add six more IoT sensor devices, but the value of mAP50 is reduced by 0.093 (13.5%).

Conclusions
In this paper, we proposed a oneM2M-compliant AIoT system consisting of IoT sensor devices with random sampling functions for compressed sensing, an IoT edge gateway with CS recovery and domain transform functions for compressed sensing and a YOLOv5 deep learning function for object detection, and an IoT server. This AIoT system was able to reduce its overall data traffic or add more IoT sensor devices by changing the compressed sensing rates of the random sampling functions in the IoT sensor devices. To analyze the effects of compressed sensing on YOLOv5 object detection in the IoT edge gateway, recall and precision performances were first investigated, and we found that recall slightly decreases but precision remains almost constant as the compressed sensing rate decreases. Furthermore, after analyzing mAP50 and mAP performances, we found that mAP50 and mAP are gradually degraded as the compressed sensing rate decreases. Therefore, if proper compressed sensing rates of the IoT sensor devices are chosen, the proposed AIoT system will reduce the overall data traffic or increase the number of IoT sensor devices without significant performance degradation of YOLOv5.