Open Access

ARTICLE

A Hybrid Approach for Query-Based Data Extraction Using Ensemble BERT Model with Walrus Optimization Algorithm

Poluru Eswaraiah1, Uddagiri Sirisha2,*, Shaik Abdul Nabi3, Revathi Durgam4, Pallavi Malavath5, Gilakara Muni Nagamani6
1 Department of Computer Science and Engineering (Data Science), Vignan’s Institute of Management and Technology for Women, Hyderabad, India
2 Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Kanuru, India
3 Department of Computer Science and Engineering, AVN Institute of Engineering and Technology, Hyderabad, India
4 Department of Computer Science and Engineering (Data Science), AVN Institute of Engineering and Technology, Hyderabad, India
5 Department of Computer Science and Engineering (AI & ML), BVRIT Hyderabad College of Engineering for Women, Hyderabad, India
6 Department of Computer Science & Information Technology, Koneru Lakshmaiah Education Foundation Deemed to be University, Green Fields, Vaddeswaram, India
* Corresponding Author: Uddagiri Sirisha

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.078511

Received 01 January 2026; Accepted 20 April 2026; Published online 25 May 2026

Abstract

The growing volume of digital text makes it harder to extract relevant information from unstructured data. Transformer models such as BERT, ALBERT, and RoBERTa are powerful, but they can be difficult to tune (hyperparameter optimization) and to adapt to new domains. To address these issues, a hybrid ensemble BERT model is proposed, optimized using the Walrus Optimization Algorithm (WaOA). The framework combines PCA for dimensionality reduction, ontology normalization, and K-means clustering to improve semantic comprehension. Experimental results on the SQuAD 2.0 and MS MARCO datasets show that the proposed model outperforms the baseline models. WaOA can improve convergence, reduce training time, and enhance prediction accuracy. The model also improves the semantic relevance of the extracted information, and attention maps visualize its focus on relevant query terms. The method improves efficiency, reduces redundancy, and generalizes better across different query types. The framework performs consistently and reliably across different data conditions, including varying input formats and noise levels, and it can be generalized to multilingual and domain-specific applications. Overall, the framework provides a scalable and reliable solution for real-world information extraction.

Keywords

Query-based information extraction; ensemble BERT; walrus optimization algorithm; metaheuristic learning; PCA; K-means clustering; ROUGE; t-SNE; attention visualization
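
The abstract describes an ensemble of transformer models tuned by a metaheuristic. The sketch below is only an illustration of that general idea, under several assumptions that are not taken from the paper: the search is a generic population-based search (not WaOA itself), it tunes per-model ensemble weights (the abstract does not say which parameters WaOA tunes), and the per-model scores are invented toy numbers rather than outputs of BERT, ALBERT, or RoBERTa. The names `combine`, `accuracy`, and `tune_weights` are hypothetical.

```python
import random

# Toy data (invented for illustration): each example holds candidate answers,
# each candidate has one score per model, plus the index of the gold candidate.
examples = [
    ([[0.9, 0.2, 0.3], [0.4, 0.8, 0.7]], 1),
    ([[0.6, 0.1, 0.2], [0.5, 0.9, 0.6]], 1),
    ([[0.3, 0.7, 0.8], [0.9, 0.4, 0.2]], 0),
    ([[0.2, 0.6, 0.9], [0.8, 0.3, 0.5]], 0),
    ([[0.9, 0.3, 0.3], [0.2, 0.6, 0.6]], 1),
]


def combine(scores, weights):
    """Weighted average of the per-model scores for one candidate."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def accuracy(weights, data):
    """Fraction of examples whose top-scoring candidate is the gold one."""
    correct = 0
    for candidates, gold in data:
        combined = [combine(c, weights) for c in candidates]
        if combined.index(max(combined)) == gold:
            correct += 1
    return correct / len(data)


def tune_weights(data, n_models, pop_size=12, iterations=25, seed=0):
    """Generic population-based search over ensemble weights.

    This is a stand-in for a metaheuristic optimizer, NOT the Walrus
    Optimization Algorithm: agents drift toward the best agent found so
    far, with a random perturbation whose radius shrinks over time.
    """
    rng = random.Random(seed)
    # Seed the population with uniform weights so the result can never
    # score below plain averaging on the data it was tuned on.
    population = [[1.0] * n_models]
    population += [[rng.uniform(0.05, 1.0) for _ in range(n_models)]
                   for _ in range(pop_size - 1)]
    fitness = [accuracy(w, data) for w in population]
    best_i = max(range(pop_size), key=fitness.__getitem__)
    best, best_fit = list(population[best_i]), fitness[best_i]

    for step in range(iterations):
        radius = 0.5 * (1.0 - step / iterations)
        for i in range(pop_size):
            candidate = [
                max(1e-6, a + rng.random() * (b - a) + rng.uniform(-radius, radius))
                for a, b in zip(population[i], best)
            ]
            cand_fit = accuracy(candidate, data)
            if cand_fit >= fitness[i]:  # keep moves that do not hurt
                population[i], fitness[i] = candidate, cand_fit
            if cand_fit > best_fit:
                best, best_fit = list(candidate), cand_fit
    return best, best_fit


baseline = accuracy([1.0, 1.0, 1.0], examples)
best_weights, best_fit = tune_weights(examples, n_models=3)
print("uniform-weight accuracy:", baseline)
print("tuned-weight accuracy:", best_fit)
```

Because the toy data doubles as the tuning set, the printed numbers say nothing about generalization; in practice the weights would be tuned on a validation split and reported on held-out data.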