Document Type: Original Article
Son Corporate Group, Tehran, Iran
Department of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran
Collecting insurance fraud samples is costly and, when done manually, very time-consuming. This motivates the use of unsupervised models for collecting fraud data. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to outperform other methods specifically for auto-insurance fraud detection. However, this approach does not scale to large samples and is not suitable for online fraud detection, even though real-time fraud management systems are necessary to prevent huge losses. In this study, we propose an implementation methodology that makes it possible to apply SRA to big data scenarios. We exploit the power of spectral ranking of anomalies to create an estimated target variable from the unlabeled dataset. We then use two robust models, namely random forest and deep neural networks, to fit a model to the resulting training set with estimated labels. Next, incoming live data are fed to the trained models to predict the target variable. Simulation results confirm that the proposed approach is faster than existing related methods while maintaining an acceptable false alarm rate.
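The two-stage idea described above (spectral scoring to estimate labels offline, then a supervised model for fast scoring of live data) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the spectral step is a simplified stand-in for SRA that uses the sign of the second eigenvector of a normalized Gaussian similarity matrix, and the synthetic data, the median bandwidth heuristic, the rule that the minority side is the anomalous one, and the function name `spectral_pseudo_labels` are all assumptions made for illustration. Only the random forest stage is shown; a neural network could be fitted to the same estimated labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def spectral_pseudo_labels(X):
    """Simplified spectral scoring (an illustrative stand-in, not the published SRA)."""
    # Pairwise squared Euclidean distances between all samples.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Bandwidth from the median squared distance (a heuristic assumed here).
    sigma2 = np.median(sq[np.triu_indices_from(sq, k=1)])
    W = np.exp(-sq / (2.0 * sigma2))          # Gaussian similarity matrix
    d = W.sum(axis=1)                         # node degrees
    M = W / np.sqrt(np.outer(d, d))           # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    v = vecs[:, -2]                           # first non-principal eigenvector
    # Orient the vector so that the smaller group gets positive scores
    # (assumption: anomalies are the minority).
    if (v > 0).sum() > len(v) / 2:
        v = -v
    return v, (v > 0).astype(int)


# Synthetic, unlabeled "historical" data (purely illustrative).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 2))
unusual = rng.normal(4.0, 0.5, size=(10, 2))
X_hist = np.vstack([normal, unusual])

# Stage 1 (offline, expensive): estimate a target variable without labels.
scores, y_est = spectral_pseudo_labels(X_hist)

# Stage 2 (offline): fit a supervised model on the estimated labels.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_hist, y_est)

# Stage 3 (online, cheap): score incoming records with the trained model only.
X_live = np.array([[0.1, -0.2], [4.1, 3.9]])
live_pred = clf.predict(X_live)
print(live_pred)
```

The design point is that the eigendecomposition, whose cost grows quickly with the number of samples, runs only once on historical data, while each live record needs only a forward pass through the trained classifier.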