Speech Segmentation Using Dynamic Windows and Thresholds for Arabic and English Languages

Yahia Hasan Jazyah

doi:10.3844/jcssp.2018.485.490

Research Article Open Access

Yahia Hasan Jazyah¹

¹ Arab Open University, Kuwait

Abstract

Segmentation of audio data such as human speech (splitting each word into a separate audio file, e.g. a .WAV file) has been a major concern when working with multimedia such as radio or TV recordings. Work on detecting the boundaries of spoken language has focused mainly on energy and zero-crossing thresholds for endpoint detection. Errors in endpoint detection remain a main cause of low accuracy in segmentation systems. The goal of this research is to develop an efficient algorithm that segments human speech in both English and Arabic, at different speaking speeds, with high accuracy. Simulation results show that the developed algorithm achieves high accuracy, reaching up to 91.6% on average for English speech and 89.0% for Arabic speech.
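For readers unfamiliar with the energy and zero-crossing approach mentioned above, the following is a minimal sketch of a conventional fixed-threshold endpoint detector. It is not the dynamic-window algorithm developed in this article. The function names, the frame length of 160 samples (20 ms at an assumed 8 kHz sampling rate), and both threshold values are illustrative assumptions.

```python
def frame_features(samples, frame_len):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared sample values.
        energy = sum(x * x for x in frame) / frame_len
        # Zero crossings: sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features


def detect_segments(samples, frame_len=160, energy_thr=0.01, zcr_thr=0.25):
    """Return (start_sample, end_sample) pairs for runs of active frames.

    Assumed rule: a frame is active if its energy exceeds energy_thr, or if
    its zero-crossing rate exceeds zcr_thr while its energy is still above
    one tenth of energy_thr (a rough proxy for quiet unvoiced sounds).
    """
    segments = []
    start = None
    i = -1
    for i, (energy, zcr) in enumerate(frame_features(samples, frame_len)):
        active = energy > energy_thr or (
            zcr > zcr_thr and energy > 0.1 * energy_thr
        )
        if active and start is None:
            start = i * frame_len          # segment begins at this frame
        elif not active and start is not None:
            segments.append((start, i * frame_len))
            start = None
    if start is not None:                  # signal ended inside a segment
        segments.append((start, (i + 1) * frame_len))
    return segments
```

Each returned pair marks the sample range of one candidate word, which could then be written out as a separate .WAV file. Because the thresholds here are fixed, the sketch is sensitive to recording level and speaking speed, which is the kind of endpoint error the abstract identifies as a main cause of low accuracy.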

Journal of Computer Science

Volume 14 No. 4, 2018, 485-490

DOI: https://doi.org/10.3844/jcssp.2018.485.490

Submitted On: 4 January 2018 Published On: 18 April 2018

How to Cite: Jazyah, Y. H. (2018). Speech Segmentation Using Dynamic Windows and Thresholds for Arabic and English Languages. Journal of Computer Science, 14(4), 485-490. https://doi.org/10.3844/jcssp.2018.485.490

Copyright: © 2018 Yahia Hasan Jazyah. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Audio
Voice
Speech
Segmentation