TY - EJOU
AU - Song, Xueping
AU - Yang, Jianming
AU - Zhang, Shuyu
AU - Zhang, Jicun
TI - Research on Enhanced Contraband Dataset ACXray Based on ETL
T2 - Computers, Materials & Continua
PY - 2024
VL - 79
IS - 3
SN - 1546-2226
AB - To address the shortage of public datasets of customs X-ray contraband images and the difficulty of deploying trained models in engineering applications, a method is proposed that uses the Extract-Transform-Load (ETL) approach to build an X-ray dataset of contraband items. First, X-ray scatter image data is collected and cleaned. Using Kafka message queues and the Elasticsearch (ES) distributed search engine, the data is transmitted in real time to cloud servers. Next, contraband data is annotated using a combination of neural networks and manual methods to improve annotation efficiency, and a mean hash algorithm is implemented for quick image retrieval. Integrating targets with backgrounds augments the X-ray contraband image data and increases the number of positive samples. Finally, an Airport Customs X-ray dataset (ACXray) compatible with customs business scenarios is constructed, featuring an increased number of positive contraband samples. In experiments, the Mask Region-based Convolutional Neural Network (Mask R-CNN) algorithm was trained on each of three datasets and tested on 400 real customs images. The recognition accuracy of the models trained on Security Inspection X-ray (SIXray) and Occluded Prohibited Items X-ray (OPIXray) decreased by 16.3% and 15.1%, respectively, while the accuracy of the model trained on ACXray was almost unaffected. This indicates that the ACXray-trained model has strong generalization capability and is better suited to customs detection scenarios.
KW - X-ray contraband; ETL; data enhancement; dataset
DO - 10.32604/cmc.2024.049446