Research Article Open Access

CASRA+: A Colloquial Arabic Speech Recognition Application

Ramzi A. Haraty and Omar El Ariss

Abstract

The research proposed here is an Arabic speech recognition application that concentrates on the Lebanese dialect. The system starts by sampling the speech, that is, converting the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this case, the stored model is phoneme-based. This reference model differs from direct word template matching, where speech features extracted from the input are compared directly to word templates. Each word template in the direct matching model is stored as a vector of feature parameters, so when the vocabulary of the automatic speech recognition (ASR) system becomes large, the memory required for the word templates becomes very large. In contrast, the model used here is phoneme-like template matching: word templates are stored as phoneme-like template parameters, so the memory required for the word templates does not grow as fast as in the direct matching model.
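The memory argument above can be illustrated with a rough back-of-the-envelope sketch. This is not the authors' implementation: the function names and every number (frames per word, coefficients per frame, inventory size) are hypothetical values chosen only to show how the two storage schemes scale with vocabulary size.

```python
def direct_template_floats(vocab_size, frames_per_word, coeffs_per_frame):
    """Feature values stored when every word keeps its own full template."""
    return vocab_size * frames_per_word * coeffs_per_frame


def phoneme_template_storage(vocab_size, phonemes_per_word, num_phonemes,
                             frames_per_phoneme, coeffs_per_frame):
    """Storage when words are sequences of references into a shared
    phoneme-like inventory.

    Returns (shared_feature_values, per_word_index_entries). The first
    term is fixed once the inventory is built; only the second grows
    with the vocabulary, and each entry is a small index rather than a
    full feature vector.
    """
    shared = num_phonemes * frames_per_phoneme * coeffs_per_frame
    indices = vocab_size * phonemes_per_word
    return shared, indices


if __name__ == "__main__":
    # Hypothetical sizes: 13 coefficients per frame, 50 frames per word,
    # 40 phoneme-like units of 10 frames each, 5 units per word.
    for vocab in (100, 1000, 10000):
        direct = direct_template_floats(vocab, 50, 13)
        shared, indices = phoneme_template_storage(vocab, 5, 40, 10, 13)
        print(vocab, direct, shared, indices)
```

Under these assumed sizes, the direct scheme stores 650 feature values per added word, while the phoneme-like scheme adds only 5 index entries per word on top of a fixed shared inventory.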

American Journal of Applied Sciences
Volume 4 No. 1, 2007, 23-32

DOI: https://doi.org/10.3844/ajassp.2007.23.32

Submitted On: 7 September 2006
Published On: 30 April 2007

How to Cite: Haraty, R. A. & El Ariss, O. (2007). CASRA+: A Colloquial Arabic Speech Recognition Application. American Journal of Applied Sciences, 4(1), 23-32. https://doi.org/10.3844/ajassp.2007.23.32

Keywords

  • Arabic language
  • speech recognition
  • template matching