Large Language Models for Effective Detection of Algorithmically Generated Domains: A Comprehensive Review

Hamed Alqahtani; Gulshan Kumar

doi:10.32604/cmes.2025.067738

Open Access

REVIEW

Hamed Alqahtani¹, Gulshan Kumar²,*

¹ College of Computer Science, Informatics and Computer Systems Department, Center of Artificial Intelligence, King Khalid University, P.O. Box 960, Abha, 62223, Saudi Arabia
² Department of Computer Applications, Shaheed Bhagat Singh State University, Ferozepur, 152002, Punjab, India

* Corresponding Author: Gulshan Kumar.

Computer Modeling in Engineering & Sciences 2025, 144(2), 1439-1479. https://doi.org/10.32604/cmes.2025.067738

Received 11 May 2025; Accepted 29 July 2025; Issue published 31 August 2025

Abstract

Domain Generation Algorithms (DGAs) continue to pose a significant threat in modern malware infrastructures by enabling resilient and evasive communication with Command and Control (C&C) servers. Traditional detection methods—rooted in statistical heuristics, feature engineering, and shallow machine learning—struggle to adapt to the increasing sophistication, linguistic mimicry, and adversarial variability of DGA variants. The emergence of Large Language Models (LLMs) marks a transformative shift in this landscape. Leveraging deep contextual understanding, semantic generalization, and few-shot learning capabilities, LLMs such as BERT, GPT, and T5 have shown promising results in detecting both character-based and dictionary-based DGAs, including previously unseen (zero-day) variants. This paper provides a comprehensive and critical review of LLM-driven DGA detection, introducing a structured taxonomy of LLM architectures, evaluating the linguistic and behavioral properties of benchmark datasets, and comparing recent detection frameworks across accuracy, latency, robustness, and multilingual performance. We also highlight key limitations, including challenges in adversarial resilience, model interpretability, deployment scalability, and privacy risks. To address these gaps, we present a forward-looking research roadmap encompassing adversarial training, model compression, cross-lingual benchmarking, and real-time integration with SIEM/SOAR platforms. This survey aims to serve as a foundational resource for advancing the development of scalable, explainable, and operationally viable LLM-based DGA detection systems.

Keywords

Adversarial domains; cyber threat detection; domain generation algorithms; large language models; machine learning security

Cite This Article

APA Style

Alqahtani, H., & Kumar, G. (2025). Large Language Models for Effective Detection of Algorithmically Generated Domains: A Comprehensive Review. Computer Modeling in Engineering & Sciences, 144(2), 1439–1479. https://doi.org/10.32604/cmes.2025.067738

Vancouver Style

Alqahtani H, Kumar G. Large Language Models for Effective Detection of Algorithmically Generated Domains: A Comprehensive Review. Comput Model Eng Sci. 2025;144(2):1439–1479. https://doi.org/10.32604/cmes.2025.067738

IEEE Style

H. Alqahtani and G. Kumar, “Large Language Models for Effective Detection of Algorithmically Generated Domains: A Comprehensive Review,” Comput. Model. Eng. Sci., vol. 144, no. 2, pp. 1439–1479, 2025. https://doi.org/10.32604/cmes.2025.067738

Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
