Comparing performances and effectiveness of machine learning classifiers in detecting financial accounting fraud for Turkish SMEs

Hamal S., ŞENVAR Ö.

INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, vol.14, no.1, pp.769-782, 2021 (Journal Indexed in SCI)

  • Volume: 14 Issue: 1
  • Publication Date: 2021
  • DOI: 10.2991/ijcis.d.210203.007
  • Pages: pp.769-782


Turkish small- and medium-sized enterprises (SMEs) are exposed to fraud risks, and creditor banks face major challenges in dealing with financial accounting fraud. This study explores the effectiveness of machine learning classifiers in detecting financial accounting fraud by assessing the financial statements of 341 Turkish SMEs from 2013 to 2017. The data were obtained from one of the leading creditor banks in Turkey. The classes are highly imbalanced, with 1384 nonfraudulent cases and 321 fraudulent cases (from 122 firms), so sampling techniques are used to mitigate the class imbalance problem. The research methodology consists of two stages. The first stage is data preprocessing, which comprises financial ratio calculation, feature selection to identify the financial ratios with the greatest impact on fraudulent financial statements, and two sampling methods: oversampling with the Synthetic Minority Oversampling Technique (SMOTE) and undersampling. The second stage is performance evaluation and comparison, in which seven classifiers (support vector machine, Naive Bayes, artificial neural network, K-nearest neighbor, random forest, logistic regression, and bagging) are run and compared using performance metrics. The classifiers are also compared without feature selection and/or sampling techniques. The results reveal that the random forest model with oversampling and without feature selection outperforms all other models. (C) 2021 The Authors. Published by Atlantis Press B.V.
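
The oversampling step named in the abstract can be illustrated with a minimal, standard-library sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote`, the neighbour count `k`, the fixed seed, and the toy feature rows are all assumptions made for illustration, and the abstract does not state the target class ratio, so the 1:1 balance used in the arithmetic below is also an assumption. Only the class counts (1384 and 321) come from the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows.

    Hypothetical helper: each synthetic row lies on the line segment between
    a randomly chosen minority row and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority rows, sorted by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Class counts reported in the abstract.
n_nonfraud, n_fraud = 1384, 321
imbalance_ratio = n_nonfraud / n_fraud  # roughly 4.3 nonfraudulent cases per fraudulent case
# Synthetic fraud rows needed IF the classes were balanced 1:1 (assumed target).
n_needed = n_nonfraud - n_fraud

# Toy two-feature "financial ratio" rows (made-up values, not the study's data).
toy_fraud = [[0.10, 1.2], [0.15, 1.0], [0.05, 1.5], [0.20, 0.9]]
toy_synthetic = smote(toy_fraud, n_new=6, k=2)
```

Because every synthetic row is a convex combination of two real fraud rows, the new rows stay inside the range already covered by the minority class instead of duplicating existing records, which is the main difference from plain random oversampling. Undersampling, the second method mentioned, would instead discard nonfraudulent rows until the class sizes are closer.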