ADS: Adaptive Dataset Selection for Fine-Tuning in Anomalous Text

Xiaoyong Zhao¹, Jiamin Wu²,*, Lei Wang²
1 School of Information Management, Beijing Information Science and Technology University, Beijing, China
2 College of Computer Science, Beijing Information Science and Technology University, Beijing, China
* Corresponding Author: Jiamin Wu. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.077179

Received 03 December 2025; Accepted 18 March 2026; Published online 22 May 2026

Abstract

As large language models continue to improve, further enhancing their ability on complex tasks has become a key issue. Anomalous text detection is one such task: its semantic complexity and high-risk nature make it challenging for a model to identify non-standard semantics. However, existing fine-tuning methods rely heavily on static data selection strategies, which adapt poorly to the evolving capabilities of the model and result in low training efficiency. This article proposes ADS (Adaptive Dataset Selection), an adaptive framework for selecting fine-tuning data for anomalous text detection. ADS performs model-aware data selection prior to fine-tuning, adapting to the initial state of the pre-trained language model by selecting the samples that are most informative for the target anomaly detection task. Empirical results on mainstream large language model architectures show that ADS substantially reduces the amount of training data while still outperforming existing static strategies and mainstream compression methods. With only 1000 fine-tuning samples, ADS achieves a 92% F1 score and improves accuracy by over 22% compared to the baseline. This study proposes an efficient data selection mechanism from the perspective of model capability and the dynamic adaptation of data, providing theoretical support and a practical path for fine-tuning large models in low-resource scenarios.

Keywords

Adaptive dataset selection; anomalous text detection; fine-tuning; large language models; dynamic sample optimization; data diversity
