A Proposed Model for Persian Stance Detection on Social Media

Document Type: Original Article

Authors

1 Department of Information Technology Management, Science and Research Branch, Islamic Azad University, Tehran, Iran

2 Central Tehran Branch, Islamic Azad University, Tehran, Iran

Abstract

Stance detection is a recent research topic that has emerged from the growing importance of opinion mining. It aims to determine an author's view toward a specific topic or claim. Stance detection has become an important module in numerous applications such as fake news detection, argument search, claim validation, and author profiling. Despite considerable progress in languages such as English, progress has been limited in languages such as Persian, where datasets for this task are scarce. In this paper, two solutions are used to address this issue: 1) data augmentation and 2) the application of different learning approaches (machine learning, deep learning, and transfer learning) together with a meaningful combination of their outcomes. The results show that each of these solutions enhances stance detection performance on its own, and that combining both yields a very significant improvement.
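
The abstract does not state how the outcomes of the three learning approaches are combined. As a minimal sketch of one common option, weighted soft voting averages the class probabilities produced by each model and picks the class with the highest fused score. The label set, model names, probabilities, and weights below are hypothetical values chosen for illustration and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical label set, for illustration only.
LABELS = ["agree", "disagree", "neutral"]


def soft_vote(probabilities: Dict[str, Sequence[float]],
              weights: Dict[str, float]) -> List[float]:
    """Return the weighted average of per-model class probabilities."""
    total = sum(weights[name] for name in probabilities)
    n_classes = len(next(iter(probabilities.values())))
    fused = [0.0] * n_classes
    for name, probs in probabilities.items():
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / total
    return fused


def predict(probabilities: Dict[str, Sequence[float]],
            weights: Dict[str, float],
            labels: Sequence[str] = LABELS) -> str:
    """Return the label with the highest fused probability."""
    fused = soft_vote(probabilities, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Assumed outputs of three classifiers for a single post.
model_outputs = {
    "machine_learning": [0.50, 0.30, 0.20],
    "deep_learning": [0.30, 0.50, 0.20],
    "transfer_learning": [0.20, 0.70, 0.10],
}
# Assumed weights; in practice these might reflect validation performance.
model_weights = {
    "machine_learning": 1.0,
    "deep_learning": 1.0,
    "transfer_learning": 2.0,
}

fused = soft_vote(model_outputs, model_weights)
label = predict(model_outputs, model_weights)
print(label, [round(p, 2) for p in fused])
```

Soft voting is used here only because it is simple to read; other fusion schemes such as majority voting or stacking would fit the same description.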

