Robot Arm Performing Writing through Speech Recognition Using Dynamic Time Warping Algorithm

Authors

Department of Mechanical Engineering, SRM University, Chennai-603203

Abstract

This paper aims to develop a writing robot that recognizes speech signals from the user. The robot arm is constructed mainly for disabled people who cannot write on their own. The dynamic time warping (DTW) algorithm is used to recognize the user's speech signal. The robot arm carries out the writing action by reducing the redundancy it frequently faces, which gives high accuracy in both velocity and position along its trajectory.
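As an illustration of the recognition step, the following is a minimal sketch of template matching with DTW. It is not the implementation used in this paper. It assumes each utterance has already been converted to a sequence of feature vectors (for example, per-frame cepstral coefficients), and the names `dtw_distance`, `recognize`, and `templates` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two sequences of feature vectors.

    a, b: lists of equal-dimension vectors (lists of floats); the two
    sequences may differ in length.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch b: frame b[j-1] is reused
                cost[i][j - 1],      # stretch a: frame a[i-1] is reused
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance.

    templates: dict mapping a label (e.g. a letter to write) to a
    reference feature sequence. This dictionary is an assumption of
    the sketch, not a structure described in the paper.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


if __name__ == "__main__":
    # Toy one-dimensional "features" standing in for real speech frames.
    templates = {
        "A": [[0.0], [1.0], [2.0]],
        "B": [[5.0], [5.0], [5.0]],
    }
    slow_a = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # time-stretched version of "A"
    print(recognize(slow_a, templates))
```

Because DTW allows frames to be repeated during alignment, a slower or faster rendition of the same word can still match its template, which is why the method suits isolated-word commands from a single user.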

Keywords

