Open Access



Interactive Transformer for Small Object Detection

Jian Wei, Qinzhao Wang*, Zixu Zhao

Department of Weaponry and Control, Army Academy of Armored Forces, Beijing, 100071, China

* Corresponding Author: Qinzhao Wang.

Computers, Materials & Continua 2023, 77(2), 1699-1717.


The detection of large-scale objects has achieved high accuracy, but the detection of small objects has not enjoyed similar success, owing to their low peak signal-to-noise ratio (PSNR), fewer distinguishing features, and tendency to be occluded by their surroundings. To address this problem, this paper proposes an attention mechanism based on cross-Key values. Building on the traditional transformer, this paper first improves the feature processing with a convolution module, which effectively maintains the local semantic context in the middle layer and significantly reduces the number of model parameters. Then, to enhance the effectiveness of the attention mask, two Key values are calculated simultaneously along Query and Value using dual-branch parallel processing, which strengthens the attention acquisition mode and improves the coupling of key information. Finally, focusing on the feature maps of different channels, the multi-head attention mechanism is applied to the channel attention mask to improve the feature utilization of the middle layer. In comparisons on three small object datasets, the plug-and-play interactive transformer (IT-transformer) module that we designed effectively improves the detection results of the baseline.


Cite This Article

J. Wei, Q. Wang and Z. Zhao, "Interactive transformer for small object detection," Computers, Materials & Continua, vol. 77, no. 2, pp. 1699–1717, 2023.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.