Abstract
Disability is an impairment affecting an individual's livelihood and independence. Assistive technology enables people with disabilities to overcome barriers to learning, access information, contribute to the community, and live independently. This article proposes an assistive device that enables people with visual impairments and learning disabilities to access printed Arabic material in real time, and helps them participate in the education system and the professional workforce.
The proposed assistive device employs Optical Character Recognition (OCR) and Text-to-Speech (TTS) conversion. OCR is achieved using image processing, character extraction, and classification, while Arabic speech synthesis is achieved through concatenative synthesis followed by Multi-Band Resynthesis Overlap-Add (MBROLA); waveform generation in this second phase produces the spoken output that the user hears. OCR character and word accuracy tests were conducted for nine Arabic fonts. The results show that six fonts were recognized with over 60% character accuracy and two fonts were recognized with over 88% accuracy. A Mean Opinion Score (MOS) test for speech quality was also conducted. It yielded an overall MOS of 3.53/5 and indicated that users were able to understand the speech. Real-time usability testing was conducted with 10 subjects. It yielded an overall average agreement score of 3.9/5 and indicated that the proposed Arabic reader pen meets the real-time constraints, is pleasant and satisfying to use, and can help make printed Arabic material accessible to people with visual impairments and people with learning disabilities.
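The abstract reports character accuracy and a Mean Opinion Score without stating how either was computed. The following is a minimal sketch under assumptions: character accuracy is taken as one minus the edit distance divided by the reference length (a common OCR convention, not confirmed by the source), and MOS is taken as the arithmetic mean of 1 to 5 listener ratings. The function names `edit_distance`, `character_accuracy`, and `mean_opinion_score` are hypothetical and chosen for illustration only.

```python
from statistics import mean


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - errors / reference length, clamped to [0, 1]."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


def mean_opinion_score(ratings: list[int]) -> float:
    """Assumed metric: arithmetic mean of listener ratings on a 1-5 scale."""
    if not ratings or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be non-empty integers in 1..5")
    return mean(ratings)
```

Because Python strings are sequences of Unicode code points, the same functions apply to Arabic text directly; whether diacritics are counted as separate characters is a design choice that the source does not specify.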
Index Terms
Real-time Assistive Reader Pen for Arabic Language