Open Access

ARTICLE

Head-Body Guided Deep Learning Framework for Dog Breed Recognition

Noman Khan1, Afnan2, Mi Young Lee3,*, Jakyoung Min4,*

1 Department of Computer Science, Yonsei University, Seoul, 03722, Republic of Korea
2 School of Information Technology, Murdoch University, Perth, WA 6150, Australia
3 Research Department, Chung-Ang University, Seoul, 06974, Republic of Korea
4 Department of Design Innovation, Sejong University, Seoul, 05006, Republic of Korea

* Corresponding Authors: Mi Young Lee; Jakyoung Min

Computers, Materials & Continua 2025, 85(2), 2935-2958. https://doi.org/10.32604/cmc.2025.069058

Abstract

Fine-grained dog breed classification presents significant challenges due to subtle inter-class differences, pose variations, and intra-class diversity. To address these complexities and limitations of traditional handcrafted approaches, a novel and efficient two-stage Deep Learning (DL) framework tailored for robust fine-grained classification is proposed. In the first stage, a lightweight object detector, YOLO v8N (You Only Look Once Version 8 Nano), is fine-tuned to localize both the head and full body of the dog from each image. In the second stage, a dual-stream Vision Transformer (ViT) architecture independently processes the detected head and body regions, enabling the extraction of region-specific, complementary features. This dual-path approach improves feature discriminability by capturing localized cues that are vital for distinguishing visually similar breeds. The proposed framework introduces several key innovations: (1) a modular and lightweight head–body detection pipeline that balances accuracy with computational efficiency, (2) a region-aware ViT model that leverages spatial attention for enhanced fine-grained recognition, and (3) a training scheme incorporating advanced augmentations and structured supervision to maximize generalization. These contributions collectively enhance model performance while maintaining deployment efficiency. Extensive experiments conducted on the Tsinghua Dogs dataset validate the effectiveness of the approach. The model achieves an accuracy of 90.04%, outperforming existing State-of-the-Art (SOTA) methods across all key evaluation metrics. Furthermore, statistical significance testing confirms the robustness of the observed improvements over multiple baselines. The proposed method presents an effective solution for breed recognition tasks and shows strong potential for broader applications, including pet surveillance, veterinary diagnostics, and cross-species classification. Notably, it achieved an accuracy of 96.85% on the Oxford-IIIT Pet dataset, demonstrating its robustness across different species and breeds.
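
The data flow of the two-stage design can be illustrated with a minimal sketch. It is not the authors' implementation: the detector output is supplied by hand instead of coming from YOLO v8N, each ViT stream is replaced by a simple average-pooling stand-in, and fusing the two streams by concatenation is an assumption, since the abstract does not state how head and body features are combined. All function names (`crop`, `encode_region`, `classify`, `predict_breed`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), with x2 and y2 exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def encode_region(region: Image, grid: int = 2) -> List[float]:
    """Stand-in for one ViT stream: average-pool the region on a grid x grid layout.

    The region must be at least `grid` pixels in each dimension.
    """
    h, w = len(region), len(region[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            ys = range(gy * h // grid, (gy + 1) * h // grid)
            xs = range(gx * w // grid, (gx + 1) * w // grid)
            cells = [region[y][x] for y in ys for x in xs]
            feats.append(sum(cells) / len(cells))
    return feats


def classify(features: Sequence[float],
             weights: Sequence[Sequence[float]],
             labels: Sequence[str]) -> str:
    """Linear classifier: one weight row per breed, highest score wins."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return labels[scores.index(max(scores))]


def predict_breed(image: Image,
                  detections: Dict[str, Box],
                  weights: Sequence[Sequence[float]],
                  labels: Sequence[str]) -> str:
    """Stage 2 of the pipeline, given stage-1 boxes for 'head' and 'body'."""
    head_feats = encode_region(crop(image, detections["head"]))
    body_feats = encode_region(crop(image, detections["body"]))
    fused = head_feats + body_feats  # assumed fusion: concatenate both streams
    return classify(fused, weights, labels)


if __name__ == "__main__":
    # Toy 8x8 image and hand-written boxes standing in for detector output.
    image = [[float(x + y) for x in range(8)] for y in range(8)]
    detections = {"head": (0, 0, 4, 4), "body": (0, 0, 8, 8)}
    weights = [[1.0] * 4 + [0.0] * 4,   # hypothetical breed that keys on head features
               [0.0] * 4 + [1.0] * 4]   # hypothetical breed that keys on body features
    print(predict_breed(image, detections, weights, ["breed_a", "breed_b"]))
```

Keeping the two streams separate until the final fusion step is what lets each encoder specialize on its own region; in a real system the stand-in encoder would be replaced by a trained ViT per stream and the boxes would come from the fine-tuned detector.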

Keywords

Animal science; computer vision; dog breed; deep learning; recognition

Cite This Article

APA Style
Khan, N., Afnan, Lee, M.Y., Min, J. (2025). Head-Body Guided Deep Learning Framework for Dog Breed Recognition. Computers, Materials & Continua, 85(2), 2935–2958. https://doi.org/10.32604/cmc.2025.069058
Vancouver Style
Khan N, Afnan, Lee MY, Min J. Head-Body Guided Deep Learning Framework for Dog Breed Recognition. Comput Mater Contin. 2025;85(2):2935–2958. https://doi.org/10.32604/cmc.2025.069058
IEEE Style
N. Khan, Afnan, M. Y. Lee, and J. Min, “Head-Body Guided Deep Learning Framework for Dog Breed Recognition,” Comput. Mater. Contin., vol. 85, no. 2, pp. 2935–2958, 2025. https://doi.org/10.32604/cmc.2025.069058



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.