Open Access



A Multi-Module Machine Learning Approach to Detect Tax Fraud

N. Alsadhan*

King Saud University, Riyadh, 11451, Saudi Arabia

* Corresponding Author: N. Alsadhan.

Computer Systems Science and Engineering 2023, 46(1), 241-253.


Tax fraud is a substantial problem for governments around the world. It is defined as the intentional alteration of information provided on a tax return to reduce a taxpayer’s liability, by either understating sales or overstating purchases. According to recent studies, governments lose over $500 billion annually to tax fraud, a loss that motivates tax authorities worldwide to implement efficient fraud detection strategies. Most machine learning work on tax fraud centers on supervised models. A significant drawback of this approach is that it requires previously audited tax returns, which constitute only a small percentage of the data. Other strategies use unsupervised models, which search the whole data set for patterns but ignore whether the tax returns are fraudulent. Unsupervised models are therefore of limited use when applied on their own to detect tax fraud. This paper addresses these limitations by proposing a fraud detection framework that combines supervised and unsupervised models to exploit the entire set of tax returns. The framework consists of four modules: a supervised module, which uses a tree-based model to extract knowledge from the data; an unsupervised module, which calculates anomaly scores; a behavioral module, which assigns a compliance score to each taxpayer; and a prediction module, which uses the outputs of the previous modules to produce a probability of fraud for each tax return. We demonstrate the effectiveness of our framework by testing it on real tax returns provided by the Saudi tax authority.
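To make the four-module data flow concrete, the following is a minimal sketch and not the paper's implementation. Everything specific in it is an assumption made for illustration: the toy returns and their field names, a one-split decision stump standing in for the tree-based model, an absolute z-score standing in for the anomaly score, the share of on-time filings standing in for the compliance score, and a logistic combination with hand-picked weights standing in for the prediction module. The abstract does not specify any of these choices.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy returns; "label" is None for returns that were never audited.
RETURNS = [
    {"id": 1, "taxpayer": "A", "sales": 100.0, "purchases": 50.0, "on_time": True, "label": 0},
    {"id": 2, "taxpayer": "A", "sales": 120.0, "purchases": 66.0, "on_time": True, "label": None},
    {"id": 3, "taxpayer": "B", "sales": 80.0, "purchases": 76.0, "on_time": False, "label": 1},
    {"id": 4, "taxpayer": "B", "sales": 90.0, "purchases": 88.0, "on_time": False, "label": None},
    {"id": 5, "taxpayer": "C", "sales": 200.0, "purchases": 110.0, "on_time": True, "label": 0},
    {"id": 6, "taxpayer": "C", "sales": 60.0, "purchases": 57.0, "on_time": True, "label": 1},
]


def ratio(r):
    """Single illustrative feature: purchases relative to sales."""
    return r["purchases"] / r["sales"]


def fit_stump(labeled):
    """Supervised module (assumed form): a one-split tree on the ratio feature.

    Needs at least two distinct ratio values among the audited returns.
    """
    xs = sorted({ratio(r) for r in labeled})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def errors(t):
        return sum((ratio(r) > t) != bool(r["label"]) for r in labeled)

    threshold = min(candidates, key=errors)

    def leaf_rate(side):
        leaf = [r["label"] for r in labeled if (ratio(r) > threshold) == side]
        return sum(leaf) / len(leaf) if leaf else 0.5

    rates = {True: leaf_rate(True), False: leaf_rate(False)}
    # The returned function maps a return to the fraud rate of its leaf.
    return lambda r: rates[ratio(r) > threshold]


def anomaly_scores(all_returns):
    """Unsupervised module (assumed form): absolute z-score over ALL returns."""
    xs = [ratio(r) for r in all_returns]
    mu, sigma = mean(xs), pstdev(xs)
    return {r["id"]: (abs(ratio(r) - mu) / sigma if sigma else 0.0) for r in all_returns}


def compliance_scores(all_returns):
    """Behavioral module (assumed form): share of on-time filings per taxpayer."""
    by_taxpayer = {}
    for r in all_returns:
        by_taxpayer.setdefault(r["taxpayer"], []).append(r["on_time"])
    return {t: sum(flags) / len(flags) for t, flags in by_taxpayer.items()}


def fraud_probability(tree_out, anomaly, compliance,
                      weights=(3.0, 1.0, -2.0), bias=-1.5):
    """Prediction module (assumed form): logistic combination of module outputs.

    The weights are hand-picked placeholders; a real system would learn them.
    """
    z = bias + weights[0] * tree_out + weights[1] * anomaly + weights[2] * compliance
    return 1.0 / (1.0 + math.exp(-z))


def score_returns(all_returns):
    labeled = [r for r in all_returns if r["label"] is not None]
    tree = fit_stump(labeled)                    # uses audited returns only
    anomaly = anomaly_scores(all_returns)        # uses every return
    compliance = compliance_scores(all_returns)  # uses every return
    return {
        r["id"]: fraud_probability(tree(r), anomaly[r["id"]], compliance[r["taxpayer"]])
        for r in all_returns
    }


PROBS = score_returns(RETURNS)
```

The point of the sketch is the division of labor: only the supervised step is restricted to audited returns, while the anomaly and compliance scores are computed over the whole set, so unaudited returns such as the hypothetical returns 2 and 4 still receive a fraud probability.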


Cite This Article

APA Style
Alsadhan, N. (2023). A multi-module machine learning approach to detect tax fraud. Computer Systems Science and Engineering, 46(1), 241-253.
Vancouver Style
Alsadhan N. A multi-module machine learning approach to detect tax fraud. Comput Syst Sci Eng. 2023;46(1):241-253.
IEEE Style
N. Alsadhan, "A Multi-Module Machine Learning Approach to Detect Tax Fraud," Comput. Syst. Sci. Eng., vol. 46, no. 1, pp. 241-253, 2023.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.