Journal of Computer Science


Thiruvengatanadhan Ramalingam and P. Dhanalakshmi

DOI : 10.3844/jcssp.2014.34.44

Volume 10, Issue 1

Pages 34-44


Audio classification is a fundamental step in handling the rapid growth in audio data volume. As multimedia sources grow in size, speech/music classification has become one of the most important problems in multimedia information retrieval. In this work, a speech/music discrimination system is developed that uses the Discrete Wavelet Transform (DWT) as the acoustic feature. Multiresolution analysis is the most significant statistical way to extract features from the input signal, and in this study a method is deployed to model the extracted wavelet features. Support Vector Machines (SVM), which are based on the principle of structural risk minimization, are applied to classify audio into its two classes, speech and music, by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function of the features, with classification by maximum likelihood decision. The system achieves an accuracy of 94.5%.
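
To illustrate the feature extraction step, here is a minimal sketch of a multi-level DWT applied to one audio frame, followed by simple per-sub-band statistics. The abstract does not give the wavelet family, the number of decomposition levels, or the exact statistics used, so the Haar wavelet, the three levels, and the mean-absolute-value and standard-deviation features below are all assumptions chosen for brevity. The function names are hypothetical. The resulting feature vector is the kind of input an SVM or GMM classifier would be trained on.

```python
import math


def haar_dwt_step(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal = list(signal) + [signal[-1]]
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def subband_features(signal, levels=3):
    """Mean absolute value and standard deviation of each sub-band (assumed features)."""
    feats = []
    for band in haar_wavedec(signal, levels):
        mean_abs = sum(abs(c) for c in band) / len(band)
        mean = sum(band) / len(band)
        std = math.sqrt(sum((c - mean) ** 2 for c in band) / len(band))
        feats.extend([mean_abs, std])
    return feats


# Hypothetical 8-sample frame; real frames would hold many more samples.
frame = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = subband_features(frame, levels=3)
```

With three levels the frame yields four sub-bands (one approximation and three detail bands), so the feature vector here has eight entries. Because the Haar transform is orthonormal, the coefficients preserve the energy of an even-length frame, which is a convenient sanity check on the decomposition.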


© 2014 Thiruvengatanadhan Ramalingam and P. Dhanalakshmi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.