Voice Transformation using Pitch and Spectral Mapping

Publication Type:

Conference Proceedings


5th Intl. Symposium on Women in Computing and Informatics (WCI’17), Co-affiliated with 6th Intl. Conference on Advances in Computing, Communications and Informatics (ICACCI 2017), IEEE (2017)




Dept. of Electronics and Communication Engineering


Abstract:

This paper presents a voice transformation model that uses pitch information and feed-forward neural networks operating on Line Spectral Frequencies. The aim of this work is to transform a speech signal produced by a source speaker by modifying voice-individuality parameters so that it appears to have been spoken by a chosen target speaker, without altering the message content. Most previous work on voice conversion does not compensate for the loss of spectral detail, nonlinearities in speech, or the interaction between excitation and the vocal tract. This causes over-smoothing and invariably leads to a lack of similarity between the desired and the converted speech. The key contribution of this paper is the choice of Linear Prediction Coefficients (LPC) converted to Line Spectral Frequencies (LSF) as the spectral representation; a neural network is used for the mapping. The performance of these voice conversion structures is evaluated by subjective and objective measures, which validate the efficiency of the conversion scheme. In ABX tests, the results show a 10% improvement in recognition rate for male-to-female conversion and a 20% improvement for female-to-male conversion over conversion using LPC coefficients alone. The novelty of this paper lies in the use of pitch information to build a holistic conversion model. After incorporating pitch details, recognition rates of 90% for female-to-male conversion and 80% for male-to-female conversion were achieved.

Pages: 1540–1544
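As an illustration of the representation the abstract highlights, the standard LPC-to-LSF conversion forms the symmetric and antisymmetric polynomials P(z) and Q(z) from the LPC polynomial A(z) and takes the angles of their unit-circle roots. The sketch below is a minimal NumPy version of this textbook procedure, not the authors' implementation; the function name and numerical tolerances are our own choices.

```python
import numpy as np

def lpc_to_lsf(a):
    """Convert LPC coefficients a = [1, a1, ..., ap] to Line Spectral
    Frequencies in radians, sorted in ascending order."""
    a = np.asarray(a, dtype=float)
    # Extend A(z) by one zero coefficient so P and Q have degree p + 1.
    a_ext = np.concatenate([a, [0.0]])
    # P(z) = A(z) + z^-(p+1) A(1/z)  (symmetric)
    # Q(z) = A(z) - z^-(p+1) A(1/z)  (antisymmetric)
    P = a_ext + a_ext[::-1]
    Q = a_ext - a_ext[::-1]
    # For a stable A(z), the roots of P and Q lie on the unit circle.
    roots = np.concatenate([np.roots(P), np.roots(Q)])
    ang = np.angle(roots)
    # Drop the trivial roots at z = 1 and z = -1 (angles 0 and pi) and
    # keep one angle per conjugate pair, giving p LSFs.
    return np.sort(ang[(ang > 1e-6) & (ang < np.pi - 1e-6)])
```

For a first-order predictor A(z) = 1 - 0.9 z^-1, this yields the single LSF arccos(0.9); for stable higher-order predictors the LSFs come out interleaved between the P and Q roots, which is the monotonicity property that makes LSFs well suited to frame-wise mapping and interpolation.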