Open Access

ARTICLE

A Fine-Grained Recognition Model based on Discriminative Region Localization and Efficient Second-Order Feature Encoding

Xiaorui Zhang1,2,*, Yingying Wang2, Wei Sun3, Shiyu Zhou2, Haoming Zhang4, Pengpai Wang1

1 College of Computer and Information Engineering, Nanjing Tech University, Nanjing, 211816, China
2 School of Software, Nanjing University of Information Science and Technology, Nanjing, 210044, China
3 School of Automation, Nanjing University of Information Science and Technology, Nanjing, 210044, China
4 School of Computer Science, Nanjing University of Information Science and Technology, Nanjing, 210044, China

* Corresponding Author: Xiaorui Zhang

(This article belongs to the Special Issue: Advances in Image Recognition: Innovations, Applications, and Future Directions)

Computers, Materials & Continua 2026, 87(1), 37 https://doi.org/10.32604/cmc.2025.072626

Abstract

Discriminative region localization and efficient feature encoding are crucial for fine-grained object recognition. However, existing data augmentation methods struggle to locate discriminative regions accurately when backgrounds are complex, target objects are small, or training data are limited, which leads to poor recognition. Fine-grained images also exhibit “small inter-class differences”; second-order feature encoding enhances discrimination, but it often requires two Convolutional Neural Networks (CNNs), which increases training time and complexity. This study proposes a model that integrates discriminative region localization with efficient second-order feature encoding. The model ranks feature-map channels using a fully connected layer and selects the high-importance channels to generate an enhanced map that accurately locates discriminative regions. Cropping and erasing augmentations then further refine recognition. To improve efficiency, a novel second-order feature encoding module generates an attention map from the fourth convolutional group of a 50-layer Residual Network (ResNet-50) and multiplies it with the features from the fifth group, producing second-order features while reducing dimensionality and training time. Experiments on the Caltech-University of California, San Diego Birds-200-2011 (CUB-200-2011), Stanford Cars, and Fine-Grained Visual Classification of Aircraft (FGVC Aircraft) datasets show state-of-the-art accuracies of 88.9%, 94.7%, and 93.3%, respectively.
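The two mechanisms summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation; the abstract does not give the details, so several choices here are assumptions made only for illustration: the function names, the per-channel `scores` standing in for the output of the fully connected layer, the 0.5 threshold, the 2×2 average pooling used to match the spatial sizes of the fourth and fifth groups, the hypothetical projection `proj` that produces `M` attention maps, and the signed square root with L2 normalization.

```python
import numpy as np

def enhanced_map(feat, scores, k):
    """Average the k highest-scoring channels of feat (C, H, W) and rescale to [0, 1].
    `scores` (C,) stands in for channel importances from a fully connected layer (assumption)."""
    top = np.argsort(scores)[::-1][:k]          # indices of the k most important channels
    m = feat[top].mean(axis=0)
    return (m - m.min()) / (m.max() - m.min() + 1e-8)

def bounding_box(emap, thresh=0.5):
    """Tightest box (y0, y1, x0, x1) around pixels >= thresh; the threshold is an assumption."""
    ys, xs = np.where(emap >= thresh)
    if ys.size == 0:                            # flat map: fall back to the full image
        return 0, emap.shape[0] - 1, 0, emap.shape[1] - 1
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())

def avg_pool2x2(x):
    """2x2 average pooling on (C, H, W), used here to align group-4 and group-5 sizes."""
    c, h, w = x.shape
    x = x[:, : h - h % 2, : w - w % 2]
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def second_order_encode(f4, f5, proj):
    """f4: (C4, 2H, 2W) group-4 features, f5: (C5, H, W) group-5 features,
    proj: (M, C4) hypothetical 1x1-conv weights yielding M attention maps."""
    attn = np.maximum(np.einsum("mc,chw->mhw", proj, avg_pool2x2(f4)), 0.0)  # ReLU attention
    # Each attention map weights the group-5 features, then spatial averaging: (M, C5)
    enc = np.einsum("mhw,chw->mc", attn, f5) / (f5.shape[1] * f5.shape[2])
    enc = enc.reshape(-1)
    enc = np.sign(enc) * np.sqrt(np.abs(enc))   # signed square root (assumption)
    return enc / (np.linalg.norm(enc) + 1e-12)  # L2 normalization (assumption)
```

Under these assumptions the encoded vector has M × C5 entries, whereas a full bilinear outer product of the group-5 features with themselves would have C5 × C5, which is one way the dimensionality reduction mentioned in the abstract could arise. The box returned by `bounding_box` would be the region to crop or erase.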

Keywords

Fine-grained recognition; feature encoding; data augmentation; second-order feature; discriminative regions

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.