ABSTRACT
Speaker recognition is the process of recognizing a person by their voice. Its goal is to extract, characterize and recognize the information about speaker identity carried in speech. In this paper, we discuss both conventional and Artificial Neural Network (ANN) approaches to speaker recognition. The proposed system comprises three main modules: a feature extraction module to extract the necessary features from speech waves, a Vector Quantization (VQ) module to generate the speaker codebook, and an ANN module, chosen for its high discriminative power, to classify the speakers. The proposed intelligent learning system has been applied to a case study of a text-dependent speaker recognition system, and its performance is evaluated with two feature extraction techniques: Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Cepstral Coefficients (LPCC). Experiments show that the proposed system performs significantly better than the conventional method.
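As a rough illustration of the VQ stage only, the following minimal sketch builds one codebook per speaker from feature vectors (for example MFCC or LPCC frames) and identifies a speaker by minimum average distortion. It rests on several assumptions: the function names `train_codebook`, `average_distortion` and `identify` are hypothetical, plain k-means stands in for whatever codebook training algorithm the paper uses, and the minimum-distortion rule is a common VQ decision rule rather than the paper's ANN classifier. Feature extraction is not shown.

```python
import numpy as np


def _sq_dists(features, codebook):
    """Squared Euclidean distance from every frame to every codeword."""
    return ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)


def train_codebook(features, size=4, iters=20, seed=0):
    """Fit a VQ codebook to (n_frames, n_dims) features with plain k-means.

    Assumption: k-means is used here for brevity; it is not necessarily
    the training algorithm of the paper.
    """
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    start = rng.choice(len(features), size, replace=False)
    codebook = features[start].astype(float)
    for _ in range(iters):
        # Assign each frame to its nearest codeword.
        nearest = _sq_dists(features, codebook).argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # an empty cell keeps its previous codeword
                codebook[k] = members.mean(axis=0)
    return codebook


def average_distortion(features, codebook):
    """Mean distance from each frame to its closest codeword."""
    return float(_sq_dists(features, codebook).min(axis=1).mean())


def identify(features, codebooks):
    """Return the speaker whose codebook quantizes the frames best.

    `codebooks` maps a speaker name to that speaker's codebook array.
    """
    return min(codebooks,
               key=lambda name: average_distortion(features, codebooks[name]))
```

In a full system along the lines of the abstract, `features` would come from the feature extraction module, and the codebook outputs would feed the ANN module instead of the simple `identify` rule used here.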
Index Terms
- Text dependent speaker verification system using discriminative weighting method and Artificial Neural Networks