Determining the Composition Functions of Persian Non-standard Sentences in Terminology using a Deep Learning Fuzzy Neural Network Model

Document Type: Original Article

Author

Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran

Abstract

Organizations can make well-informed decisions faster when they understand and use data correctly. Because data are generated far faster than they can be processed, data mining has become indispensable in the digital world. In Persian, as in other languages, the growth of communication through social networks has led to abbreviated spellings and a steadily increasing use of foreign loanwords and emoticons. Given the richness of Persian and its typographical and grammatical similarities to Arabic, research on Persian can also be applied to related languages. The current study therefore mines Persian non-standard sentences to find the function of each word in the sentence. Traditional natural language processing methods are constrained by the volume of computation required for each factor contributing to word functions, because the minimum number of computations is (5 × number of words 9) + (5 × number of words 15). This study therefore adopts the Gated Recurrent Unit (GRU) method to handle these computations. The proposed method reinforces word function identification by using two categories of Persian language functions, "independent" and "dependent", together with five factors that contribute to the functions of words in sentences, which serve as five output gates. The values in the training tables are fuzzy, and the center-of-gravity method is used to decide on the fuzzy values and to reduce the complexity and ambiguity of computing the probability of each event. The new method is therefore called "fuzzy GRU" for short. The results show that the proposed algorithm achieves an 80% reduction in the amount of computation per update gate, and that reinforcement improves by approximately 2 percentage points, from 67% for standard sentences to 69% for non-standard sentences.
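The center-of-gravity step mentioned in the abstract can be illustrated with a minimal sketch of discrete centroid defuzzification. This is the generic textbook formula, crisp value = Σ x·μ(x) / Σ μ(x), and not necessarily the exact procedure used in the study. The function name `centroid_defuzzify`, the sample scores, and the membership degrees are hypothetical and do not come from the paper's training tables.

```python
from typing import Sequence


def centroid_defuzzify(values: Sequence[float], memberships: Sequence[float]) -> float:
    """Discrete center-of-gravity defuzzification.

    Returns sum(x * mu(x)) / sum(mu(x)) over the sampled points.
    """
    if len(values) != len(memberships):
        raise ValueError("values and memberships must have the same length")
    total = sum(memberships)
    if total == 0:
        raise ValueError("all membership degrees are zero; centroid is undefined")
    return sum(x * mu for x, mu in zip(values, memberships)) / total


# Hypothetical example: candidate scores for one word function, paired with
# made-up membership degrees (assumed values, not taken from the paper).
scores = [0.0, 0.25, 0.5, 0.75, 1.0]
degrees = [0.0, 0.2, 0.6, 0.2, 0.0]
crisp = centroid_defuzzify(scores, degrees)
print(round(crisp, 3))
```

Because the assumed membership degrees are symmetric around 0.5, the centroid falls at 0.5. Skewing the degrees toward one end of the score range would pull the crisp value in that direction.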
