AP60: A Taxonomy-Guided Benchmark Dataset for Fine-Grained Pest Recognition with Feature-Level Confusion Analysis
Xianfeng Zhou1,2,3, Shaogang Lei1,*, Xinfeng Li2, Zhaojie Zhang2, Lijiao Jin2, Jingcheng Zhang3, Dongmei Chen3,*
1 School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou, China
2 Zhejiang Zhengyuan Geomatics Co., Ltd., Huzhou, China
3 School of Artificial Intelligence, Hangzhou Dianzi University, Hangzhou, China
* Corresponding Authors: Shaogang Lei; Dongmei Chen
Phyton-International Journal of Experimental Botany https://doi.org/10.32604/phyton.2026.080299
Received 06 February 2026; Accepted 09 May 2026; Published online 09 June 2026
Abstract
Accurate recognition of visually similar pest species remains a major challenge in agricultural vision because existing datasets often lack sufficient taxonomic structure, confusable categories, and quantitative analysis of class-level visual difficulty. To address these limitations, we present AP60, a taxonomy-guided benchmark dataset for fine-grained pest recognition, comprising 62,091 images from 60 pest categories organized according to insect taxonomy. A distinctive characteristic of AP60 is the deliberate inclusion of morphologically confusable taxa, which enables more realistic evaluation of recognition models under biologically meaningful fine-grained settings. Beyond dataset construction, we introduce a feature-level confusion analysis framework that characterizes the intrinsic visual structure of AP60 from two complementary aspects: intra-class consistency and inter-class overlap. Using ResNet-34 features and cosine similarity, we quantify class-wise representation similarity and relate it to downstream recognition difficulty. Benchmark evaluations were conducted under two complementary settings. In the closed-set setting, 12 supervised models achieved an average accuracy of 85.8% and an average F1-score of 85.1%, indicating that AP60 is a challenging yet stable benchmark for standard pest recognition. In the class-disjoint few-shot setting, three representative few-shot methods were evaluated on unseen pest categories, with FLoR achieving the best accuracy of 74.4% under the 5-way 5-shot protocol. These results suggest that AP60 supports both conventional supervised classification and data-efficient recognition of unseen pest categories from limited labeled samples. Further analysis shows that higher intra-class similarity is associated with better class-level accuracy, whereas lower inter-class separability is associated with increased misclassification. Validation on two additional related pest datasets shows that these relationships remain stable after data expansion, indicating that the proposed analysis is useful not only for interpreting performance but also for identifying the classes that may benefit most from targeted dataset refinement. Overall, AP60 serves as both a benchmark dataset for fine-grained pest recognition and a data-centric resource for diagnosing feature confusion in agricultural image classification.
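To make the two quantities named in the abstract concrete, the following is a minimal sketch of how intra-class consistency and inter-class overlap could be computed from feature vectors with cosine similarity. It is an illustration under stated assumptions, not the paper's implementation: both measures are assumed to be mean pairwise cosine similarities, the function names are hypothetical, and the toy 2-D vectors stand in for ResNet-34 features.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def intra_class_similarity(feats):
    """Mean cosine similarity over all distinct sample pairs in one class.

    Assumption: consistency is measured as the mean over pairs; the class
    must contain at least two samples.
    """
    pairs = list(combinations(feats, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def inter_class_similarity(feats_a, feats_b):
    """Mean cosine similarity over all cross-class sample pairs.

    Assumption: overlap between two classes is measured as the mean over
    every (sample in A, sample in B) pair.
    """
    pairs = list(product(feats_a, feats_b))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy features; real inputs would be ResNet-34 embeddings.
features = {
    "class_a": [[1.0, 0.0], [1.0, 0.1]],
    "class_b": [[0.9, 0.2], [1.0, 0.3]],
    "class_c": [[0.0, 1.0], [0.1, 1.0]],
}

# One consistency score per class.
intra = {name: intra_class_similarity(f) for name, f in features.items()}

# One overlap score per unordered pair of classes.
inter = {
    (a, b): inter_class_similarity(features[a], features[b])
    for a, b in combinations(features, 2)
}

for name, score in intra.items():
    print(f"intra {name}: {score:.3f}")
for (a, b), score in inter.items():
    print(f"inter {a} vs {b}: {score:.3f}")
```

In this toy setup, class_a and class_b point in nearly the same direction, so their inter-class score is high and they would be flagged as a confusable pair, while class_c is well separated from both.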
Keywords
Agricultural pest recognition; benchmark dataset; fine-grained classification; taxonomic hierarchy; few-shot recognition; feature-level confusion analysis