Home / Advanced Search

  • Title/Keywords

  • Author/Affliations

  • Journal

  • Article Type

  • Start Year

  • End Year

Update SearchingClear
  • Articles
  • Online
Search Results (303)
  • Open Access

    ARTICLE

    Effective Token Masking Augmentation Using Term-Document Frequency for Language Model-Based Legal Case Classification

    Ye-Chan Park1, Mohd Asyraf Zulkifley2, Bong-Soo Sohn3, Jaesung Lee4,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.074141 - 10 February 2026

    Abstract Legal case classification involves the categorization of legal documents into predefined categories, which facilitates legal information retrieval and case management. However, real-world legal datasets often suffer from class imbalances due to the uneven distribution of case types across legal domains. This leads to biased model performance, in the form of high accuracy for overrepresented categories and underperformance for minority classes. To address this issue, in this study, we propose a data augmentation method that masks unimportant terms within a document selectively while preserving key terms from the perspective of the legal domain. This approach enhances More >

  • Open Access

    REVIEW

    Prompt Injection Attacks on Large Language Models: A Survey of Attack Methods, Root Causes, and Defense Strategies

    Tongcheng Geng1,#, Zhiyuan Xu2,#, Yubin Qu3,*, W. Eric Wong4

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.074081 - 10 February 2026

    Abstract Large language models (LLMs) have revolutionized AI applications across diverse domains. However, their widespread deployment has introduced critical security vulnerabilities, particularly prompt injection attacks that manipulate model behavior through malicious instructions. Following Kitchenham’s guidelines, this systematic review synthesizes 128 peer-reviewed studies from 2022 to 2025 to provide a unified understanding of this rapidly evolving threat landscape. Our findings reveal a swift progression from simple direct injections to sophisticated multimodal attacks, achieving over 90% success rates against unprotected systems. In response, defense mechanisms show varying effectiveness: input preprocessing achieves 60%–80% detection rates and advanced architectural defenses More >

  • Open Access

    ARTICLE

    Unlocking Edge Fine-Tuning: A Sample-Efficient Language-Empowered Split Fine-Tuning Framework

    Zuyi Huang1, Yue Wang1, Jia Liu2, Haodong Yi1, Lejun Ai1, Min Chen1,3,*, Salman A. AlQahtani4

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.074034 - 10 February 2026

    Abstract The personalized fine-tuning of large language models (LLMs) on edge devices is severely constrained by limited computation resources. Although split federated learning alleviates on-device burdens, its effectiveness diminishes in few-shot reasoning scenarios due to the low data efficiency of conventional supervised fine-tuning, which leads to excessive communication overhead. To address this, we propose Language-Empowered Split Fine-Tuning (LESFT), a framework that integrates split architectures with a contrastive-inspired fine-tuning paradigm. LESFT simultaneously learns from multiple logically equivalent but linguistically diverse reasoning chains, providing richer supervisory signals and improving data efficiency. This process-oriented training allows more effective reasoning More >

  • Open Access

    ARTICLE

    Enhancing Detection of AI-Generated Text: A Retrieval-Augmented Dual-Driven Defense Mechanism

    Xiaoyu Li1,2, Jie Zhang3, Wen Shi1,2,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.074005 - 10 February 2026

    Abstract The emergence of large language models (LLMs) has brought about revolutionary social value. However, concerns have arisen regarding the generation of deceptive content by LLMs and their potential for misuse. Consequently, a crucial research question arises: How can we differentiate between AI-generated and human-authored text? Existing detectors face some challenges, such as operating as black boxes, relying on supervised training, and being vulnerable to manipulation and misinformation. To tackle these challenges, we propose an innovative unsupervised white-box detection method that utilizes a “dual-driven verification mechanism” to achieve high-performance detection, even in the presence of obfuscated… More >

  • Open Access

    ARTICLE

    OPOR-Bench: Evaluating Large Language Models on Online Public Opinion Report Generation

    Jinzheng Yu1, Yang Xu2, Haozhen Li2, Junqi Li3, Ligu Zhu1, Hao Shen1,*, Lei Shi1,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.073771 - 10 February 2026

    Abstract Online Public Opinion Reports consolidate news and social media for timely crisis management by governments and enterprises. While large language models (LLMs) enable automated report generation, this specific domain lacks formal task definitions and corresponding benchmarks. To bridge this gap, we define the Automated Online Public Opinion Report Generation (OPOR-Gen) task and construct OPOR-Bench, an event-centric dataset with 463 crisis events across 108 countries (comprising 8.8 K news articles and 185 K tweets). To evaluate report quality, we propose OPOR-Eval, a novel agent-based framework that simulates human expert evaluation. Validation experiments show OPOR-Eval achieves a More >

  • Open Access

    ARTICLE

    Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models

    Tomás Bernal-Beltrán1, Ronghao Pan1, José Antonio García-Díaz1, María del Pilar Salas-Zárate2, Mario Andrés Paredes-Valverde2, Rafael Valencia-García1,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.073629 - 10 February 2026

    Abstract The malicious dissemination of hate speech via compromised accounts, automated bot networks and malware-driven social media campaigns has become a growing cybersecurity concern. Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources. In this paper, we compare two predominant AI-based approaches for the forensic detection of malicious hate speech: (1) fine-tuning encoder-only models that have been trained in Spanish and (2) In-Context Learning techniques (Zero- and Few-Shot Learning) with large-scale language models. Our approach goes beyond binary classification, proposing a comprehensive, multidimensional evaluation that labels each… More >

  • Open Access

    ARTICLE

    Mitigating Adversarial Obfuscation in Named Entity Recognition with Robust SecureBERT Finetuning

    Nouman Ahmad1,*, Changsheng Zhang1, Uroosa Sehar2,3,4

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.073029 - 10 February 2026

    Abstract Although Named Entity Recognition (NER) in cybersecurity has historically concentrated on threat intelligence, vital security data can be found in a variety of sources, such as open-source intelligence and unprocessed tool outputs. When dealing with technical language, the coexistence of structured and unstructured data poses serious issues for traditional BERT-based techniques. We introduce a three-phase approach for improved NER in multi-source cybersecurity data that makes use of large language models (LLMs). To ensure thorough entity coverage, our method starts with an identification module that uses dynamic prompting techniques. To lessen hallucinations, the extraction module uses… More >

  • Open Access

    ARTICLE

    Syntactic and Socially Responsible Machine Translation: A POS and DEP Integrated Framework for English–Tamil

    Rama Sugavanam*, Mythili Ramu

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2026.071469 - 10 February 2026

    Abstract When performing English-to-Tamil Neural Machine Translation (NMT), end users face several challenges due to Tamil’s rich morphology, free word order, and limited annotated corpora. Although available transformer-based models offer strong baselines, they compromise syntactic awareness and the detection and management of offensive content in cluttered, noisy, and informal text. In this paper, we present POSDEP-Offense-Trans, a multi-task NMT framework that combines Part-of-Speech (POS) and Dependency Parsing (DEP) methods with a robust offensive language classification module. Our architecture enriches the Transformer encoder with syntax-aware embeddings and provides syntax-guided attention mechanisms. The architecture incorporates a structure-aware contrastive… More >

  • Open Access

    ARTICLE

    LLM-Powered Multimodal Reasoning for Fake News Detection

    Md. Ahsan Habib1, Md. Anwar Hussen Wadud2, M. F. Mridha3,*, Md. Jakir Hossen4,*

    CMC-Computers, Materials & Continua, Vol.87, No.1, 2026, DOI:10.32604/cmc.2025.070235 - 10 February 2026

    Abstract The problem of fake news detection (FND) is becoming increasingly important in the field of natural language processing (NLP) because of the rapid dissemination of misleading information on the web. Large language models (LLMs) such as GPT-4. Zero excels in natural language understanding tasks but can still struggle to distinguish between fact and fiction, particularly when applied in the wild. However, a key challenge of existing FND methods is that they only consider unimodal data (e.g., images), while more detailed multimodal data (e.g., user behaviour, temporal dynamics) is neglected, and the latter is crucial for… More >

  • Open Access

    ARTICLE

    A Novel Unified Framework for Automated Generation and Multimodal Validation of UML Diagrams

    Van-Viet Nguyen1, Huu-Khanh Nguyen2, Kim-Son Nguyen1, Thi Minh-Hue Luong1, Duc-Quang Vu1, Trung-Nghia Phung3, The-Vinh Nguyen1,*

    CMES-Computer Modeling in Engineering & Sciences, Vol.146, No.1, 2026, DOI:10.32604/cmes.2025.075442 - 29 January 2026

    Abstract It remains difficult to automate the creation and validation of Unified Modeling Language (UML) diagrams due to unstructured requirements, limited automated pipelines, and the lack of reliable evaluation methods. This study introduces a cohesive architecture that amalgamates requirement development, UML synthesis, and multimodal validation. First, LLaMA-3.2-1B-Instruct was utilized to generate user-focused requirements. Then, DeepSeek-R1-Distill-Qwen-32B applies its reasoning skills to transform these requirements into PlantUML code. Using this dual-LLM pipeline, we constructed a synthetic dataset of 11,997 UML diagrams spanning six major diagram families. Rendering analysis showed that 89.5% of the generated diagrams compile correctly, while… More >

Displaying 1-10 on page 1 of 303. Per Page