Open Access



Automated File Labeling for Heterogeneous Files Organization Using Machine Learning

Sagheer Abbas1, Syed Ali Raza1,2, M. A. Khan3, Muhammad Adnan Khan4,*, Atta-ur-Rahman5, Kiran Sultan6, Amir Mosavi7,8,9

1 School of Computer Science, National College of Business Administration & Economics, Lahore, 54000, Pakistan
2 Department of Computer Science, GC University Lahore, Pakistan
3 Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore, 54000, Pakistan
4 Department of Software, Pattern Recognition and Machine Learning Lab, Gachon University, Seongnam, 13120, Korea
5 Department of Computer Science, College of Computer Science and Information Technology (CCSIT), Imam Abdulrahman Bin Faisal University (IAU), P.O. Box 1982, Dammam, 31441, Saudi Arabia
6 Department of CIT, The Applied College, King Abdulaziz University, Jeddah, 31261, Saudi Arabia
7 John von Neumann Faculty of Informatics, Obuda University, Budapest, 1034, Hungary
8 Institute of Information Engineering, Automation and Mathematics, Slovak University of Technology in Bratislava, Bratislava, 81107, Slovakia
9 Faculty of Civil Engineering, TU-Dresden, Dresden, 01062, Germany

* Corresponding Author: Muhammad Adnan Khan. Email: email

Computers, Materials & Continua 2023, 74(2), 3263-3278.


File labeling techniques have a long history in analyzing the anthological trends in computational linguistics. The problem of uninformative file names is worse for files downloaded into systems from the Internet. Currently, most users either have to rename files manually or leave them with meaningless names, which increases the time needed to find required files and leads to redundancy and duplication of user files. To date, no significant work has been done on automated file labeling for organizing heterogeneous user files. A few attempts have been made in topic modeling. However, one major drawback of current topic modeling approaches is that, for better results, they rely on specific language types and domain similarity of the data. In this research, machine learning approaches are employed to analyze and extract information from a heterogeneous corpus. A different file labeling technique is also used to obtain a meaningful and cohesive topic for the files. The results show that the proposed methodology can generate relevant and context-sensitive names for heterogeneous data files and provide additional insight into automated file labeling in operating systems.


Cite This Article

S. Abbas, S. A. Raza, M. A. Khan, M. A. Khan, Atta-ur-Rahman et al., "Automated file labeling for heterogeneous files organization using machine learning," Computers, Materials & Continua, vol. 74, no. 2, pp. 3263–3278, 2023.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.