Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies

Document Type : Original Article

Authors

1 Son Corporate Group, Tehran, Iran

2 Department of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran

Abstract

Collecting insurance fraud samples is costly and, if performed manually, very time-consuming. This motivates the use of unsupervised models for fraud data collection. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to outperform other methods specifically for auto-insurance fraud detection. However, this approach does not scale to large samples and is not suitable for online fraud detection, whereas real-time fraud management systems are necessary to prevent huge losses. In this study, we propose an implementation methodology that makes it possible to apply SRA to big data scenarios. We exploit the power of SRA to create an estimated target variable from the unlabeled dataset. We then fit two robust models, namely random forest and deep neural networks, to the resulting training set with estimated labels. Next, incoming live data are fed to the trained models to predict the target variable. Simulation results confirm that the proposed approach is faster than existing related methods while maintaining an acceptable false alarm rate.
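
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the RBF kernel with a median bandwidth, the 5% fraud fraction, and the synthetic data are all assumptions made for illustration, and the scoring rule is a simplified stand-in for the full SRA ranking. Only the random forest branch is shown; a neural network could be fitted to the same estimated labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def spectral_anomaly_scores(X):
    """Simplified spectral anomaly ranking (a stand-in, not the exact SRA rule)."""
    # Pairwise squared Euclidean distances between claims.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # RBF similarity with a median-heuristic bandwidth (an assumed kernel choice).
    W = np.exp(-d2 / np.median(d2))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh sorts eigenvalues in ascending order, so column 1 holds the
    # first non-principal eigenvector.
    _, vecs = np.linalg.eigh(L)
    v = vecs[:, 1]
    # Eigenvector signs are arbitrary: orient v so the smaller side is positive,
    # on the assumption that fraud is the minority group.
    if np.sum(v > 0) > len(v) / 2:
        v = -v
    return v


def pseudo_labels(scores, fraud_fraction=0.05):
    """Mark the top-scoring fraction of claims as estimated fraud (label 1)."""
    k = max(1, int(round(fraud_fraction * len(scores))))
    labels = np.zeros(len(scores), dtype=int)
    labels[np.argsort(scores)[-k:]] = 1
    return labels


# Synthetic stand-in for an unlabeled claims table: 190 typical rows, 10 unusual ones.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (190, 5)), rng.normal(5.0, 1.0, (10, 5))])

# Offline step: the spectral ranking yields an estimated target variable.
y_est = pseudo_labels(spectral_anomaly_scores(X_train))

# Fit a fast supervised model on the estimated labels.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_est)

# Online step: incoming claims are scored by the trained model only.
X_live = np.array([[0.0] * 5, [5.0] * 5])
print(model.predict(X_live))
```

The design point is that the expensive eigendecomposition runs once, offline, on the unlabeled training set, while live claims only pass through the trained classifier, which is what makes online scoring feasible.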

Keywords


Z. Shaeiri and S. J. Kazemitabar / IJE TRANSACTIONS A: Basics Vol. 33, No. 7, (July 2020) 1240-1248