Design of an Intelligent Speaker Recognition System using Mel Frequency Cepstrum Coefficients and Vector Quantization for Biometric Authentication

Abstract

This paper gives an overview of automatic speaker recognition technology for biometric authentication. A person can be identified by various characteristics such as signature, fingerprints, voice, and facial features. This type of authentication is known as biometric person authentication. Speaker recognition refers to the process of automatically recognizing who is speaking on the basis of individual information included in speech waves. Reliable, high-accuracy recognition requires simple and efficient representation methods. In this paper, coefficients are extracted from the incoming speech signal using MFCC, and they represent the trained vector of the speaker. Vector Quantization is the technique used for identification. To identify the speaker, the Euclidean distance between the acoustic vector of the test input signal and the mapped codebook is calculated. The speaker whose trained vector produces the smallest Euclidean distance is identified as the speaker.
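The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation. It assumes that each speaker is represented by a codebook of codewords, that the test signal is a list of per-frame feature vectors, and that the per-frame distances are averaged into one score. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_distortion(frames, codebook):
    """Mean distance from each test frame to its nearest codeword."""
    total = sum(min(euclidean(f, c) for c in codebook) for f in frames)
    return total / len(frames)

def identify(frames, codebooks):
    """Return the speaker whose codebook gives the smallest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(frames, codebooks[name]))

# Toy 2-D stand-ins for MFCC vectors (assumption: real vectors hold
# more coefficients per frame).
codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 6.0]],
}
test_frames = [[0.9, 1.1], [0.1, -0.1]]
print(identify(test_frames, codebooks))  # prints "speaker_a"
```

Averaging over frames is one common way to turn frame-level distances into a single score per speaker; the abstract does not state which aggregation the paper uses.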

Published In : IJCSN Journal Volume 4, Issue 6

Date of Publication : December 2015

Pages : 873-886

Figures : 04

Tables : 01

Publication Link : Design of An Intelligent Speaker Recognition System using Mel Frequency Cepstrum Coefficients and Vector Quantization for Biometric Authentication

Authors

Sreelakshmi V. completed her B.Tech in Electronics & Communication Engineering under Mahatma Gandhi University. She is currently pursuing an M.Tech in Electronics with specialization in VLSI and Embedded System under Cochin University of Science and Technology (CUSAT).

Dr. Gnana Sheela K received her Ph.D. in Electronics & Communication from Anna University, Chennai. She is working as a Professor in the Department of ECE at TocH Institute of Science and Technology. She has published 20 international journal papers. She is a life member of ISTE.

Keywords

Mel Frequency Cepstrum Coefficients (MFCC)

Fast Fourier Transform (FFT)

Mel Filter Bank

Windowing techniques

Euclidean Distance

Vector Quantization (VQ)

Conclusion

The automatic speaker recognition system consists of two phases: enrollment and testing. In the enrollment phase, a database of 8 speakers was created and stored as a reference. Each speaker's voice samples are trained using MFCC and Vector Quantization, and the resulting feature vectors are stored as reference models. In the testing phase, the unknown speaker's identity is matched against the reference models and the recognition decision is made. Speaker identification and verification were simulated and the results verified. The simulation was completed using MATLAB 2013a. A speaker recognition accuracy of 100% was obtained for a set of 8 pre-recorded speakers.
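The two phases can be sketched as follows. This is an illustrative outline only, not the MATLAB code used in the paper. It assumes that feature vectors have already been extracted, substitutes a plain k-means loop for the codebook training (the paper's exact training procedure is not given here), and uses a hypothetical acceptance threshold for verification. All names and values are made up for the example.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_codebook(frames, size, iterations=10):
    """Enrollment: cluster a speaker's feature vectors into `size` codewords."""
    # Assumption: seed with the first `size` frames; other schemes
    # (for example codeword splitting) are equally possible.
    codebook = [list(f) for f in frames[:size]]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for f in frames:
            nearest = min(range(len(codebook)), key=lambda i: dist(f, codebook[i]))
            clusters[nearest].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(dist(f, c) for c in codebook) for f in frames) / len(frames)

def verify(frames, codebook, threshold):
    """Testing (verification): accept the claimed identity if distortion is low."""
    return distortion(frames, codebook) <= threshold

# Toy 2-D vectors standing in for one speaker's enrollment features.
enrolled = [[0.0, 0.0], [4.0, 4.0], [0.2, 0.0], [4.0, 4.2]]
model = train_codebook(enrolled, size=2)

print(verify([[0.1, 0.0], [4.0, 4.1]], model, threshold=0.5))  # close to the model
print(verify([[9.0, 9.0]], model, threshold=0.5))              # far from the model
```

Identification over all enrolled speakers would instead compare the distortion against every stored codebook and pick the smallest. The threshold of 0.5 is arbitrary and would need tuning on real data.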
