TY  - EJOU
AU  - Derea, Zaid
AU  - Zou, Beiji
AU  - Kui, Xiaoyan
AU  - Thobhani, Alaa
AU  - Abdussalam, Amr

TI  - A Dual-Layer Attention Based CAPTCHA Recognition Approach with Guided Visual Attention
T2  - Computer Modeling in Engineering & Sciences

PY  - 2025
VL  - 142
IS  - 3
SN  - 1526-1506

AB  - Enhancing website security is crucial to combat malicious activities, and CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) has become a key method to distinguish humans from bots. Although text-based CAPTCHAs are designed to challenge machines while remaining human-readable, recent advances in deep learning have enabled models to recognize them with remarkable efficiency. In this regard, we propose a novel two-layer visual attention framework for CAPTCHA recognition that builds on traditional attention mechanisms by incorporating Guided Visual Attention (GVA), which sharpens focus on relevant visual features. We have specifically adapted the well-established image captioning task to address this need. Our approach uses the first-level attention module as guidance for the second-level attention component and incorporates two LSTM (Long Short-Term Memory) layers to enhance CAPTCHA recognition. Extensive evaluation on four diverse datasets, namely Weibo, BoC (Bank of China), Gregwar, and Captcha 0.3, shows the adaptability and efficacy of our method. Our approach achieved an accuracy of 96.70% on BoC and 95.92% on Weibo. These results underscore the effectiveness of our method in accurately recognizing CAPTCHA images and its robustness to the varied challenges these datasets pose.
KW  - Text-based CAPTCHA image recognition; guided visual attention; web security; computer vision

DO  - 10.32604/cmes.2025.059586