Open Access

ARTICLE


Text-Independent Algorithm for Source Printer Identification Based on Ensemble Learning

Naglaa F. El Abady1,*, Mohamed Taha1, Hala H. Zayed1,2

1 Department of Computer Science, Faculty of Computers and Artificial Intelligence, Benha University, 13518, Egypt
2 School of Information Technology and Computer Science (ITCS), Nile University, 12677, Egypt

* Corresponding Author: Naglaa F. El Abady. Email: email

Computers, Materials & Continua 2022, 73(1), 1417-1436. https://doi.org/10.32604/cmc.2022.028044

Abstract

Because of the widespread availability of low-cost printers and scanners, document forgery has become extremely common. Watermarks or signatures are used to protect important papers such as certificates, passports, and identification cards. Identifying the origin of a printed document is helpful for criminal investigations and for authenticating digital versions of a document. Source printer identification (SPI) has become an increasingly popular means of detecting fraud in printed documents. This paper proposes an algorithm that identifies the source printer and assigns the questioned document to one of the printer classes. The algorithm is based on global features such as the Histogram of Oriented Gradients (HOG) and local features such as Local Binary Pattern (LBP) descriptors. For classification, Decision Trees (DT), k-Nearest Neighbors (k-NN), Random Forests, bootstrap aggregating (bagging), adaptive boosting (boosting), Support Vector Machine (SVM), and mixtures of these classifiers have been employed. On a dataset of 1200 documents from 20 distinct printers (13 laser and 7 inkjet), the proposed algorithm achieved significant identification results and accurately classified the questioned documents into their appropriate printer classes. The adaptive boosting classifier attained 96% accuracy. Compared with four recently published algorithms that used the same dataset, the proposed algorithm gives better classification accuracy.
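As a rough illustration of the two feature families named in the abstract, the sketch below computes a basic 3x3 LBP code histogram and a simplified, single-cell gradient-orientation histogram, then concatenates them into one feature vector. It is not the authors' implementation: the function names, the 9-bin setting, the absence of HOG cells and block normalization, and the list-of-lists grayscale image format are all assumptions made for illustration, and the classification stage (for example, adaptive boosting) is omitted.

```python
import math

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (interior pixels only)."""
    h, w = len(img), len(img[0])
    # Eight neighbours, visited clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]

def orientation_histogram(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    Simplified stand-in for HOG: the whole image is treated as one cell.
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # central difference in x
            gy = img[y + 1][x] - img[y - 1][x]  # central difference in y
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned, [0, 180]
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0  # L2 normalization
    return [v / norm for v in hist]

def feature_vector(img):
    """Concatenate the global (orientation) and local (LBP) descriptors."""
    return orientation_histogram(img) + lbp_histogram(img)
```

In a full pipeline, one such vector per scanned document (or per image patch) would be passed to a classifier; established library implementations of HOG, LBP, and boosting would normally be preferred over this hand-written version.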

Cite This Article

N. F. El Abady, M. Taha and H. H. Zayed, "Text-independent algorithm for source printer identification based on ensemble learning," Computers, Materials & Continua, vol. 73, no. 1, pp. 1417–1436, 2022. https://doi.org/10.32604/cmc.2022.028044



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.