Significance of Glottal Closure Instants detection algorithms in Vocal Emotion Conversion

Title: Significance of Glottal Closure Instants detection algorithms in Vocal Emotion Conversion
Publication Type: Book Chapter
Year of Publication: 2017
Authors: Vekkot S, Tripathi S.
Book Title: Soft Computing Applications
Publisher: Springer International Publishing
Keywords: Dept. of Electronics and Communication Engineering.

The objective of this work is to explore the significance of efficient glottal activity detection for inter-emotion conversion. The performance of popular glottal epoch detection algorithms, namely the Dynamic Projected Phase-Slope Algorithm (DYPSA), Speech Event Detection using Residual Excitation And a Mean-based Signal (SEDREAMS) and Zero Frequency Filtering (ZFF), is compared in the context of vocal emotion conversion. Existing conversion approaches deal with synthesis or conversion from neutral speech to different emotions. In this work, we demonstrate the efficacy of determining the conversion parameters from statistical values derived from multiple emotions and using them for inter-emotion conversion in an Indian context. Pitch modification is effected using transformation scales derived from both male and female speakers in the IIT Kharagpur Simulated Emotion Speech Corpus. Three archetypal emotions, viz. anger, fear and happiness, were generated using a pitch and amplitude modification algorithm. Analysis of the statistical parameters for pitch after conversion revealed that anger gives good subjective and objective similarity, while the characteristics of fear and happiness are the most challenging to synthesise. In addition, the use of a male voice for synthesis gave better intelligibility. Glottal activity detection by ZFF gave the least error for median pitch. The results of this study indicate that, for emotions with overlapping characteristics such as surprise and happiness, inter-emotion conversion can be a better choice than conversion from neutral.
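As a rough illustration of how detected glottal closure instants relate to the median pitch and to a pitch transformation scale, the following minimal Python sketch may help. It is not the chapter's implementation: the function names `median_pitch_hz` and `pitch_scale`, the `max_period_s` cutoff, and the definition of the scale as a simple ratio of median pitches are all assumptions made for this example. The epoch times are assumed to come from any detector such as DYPSA, SEDREAMS or ZFF.

```python
from statistics import median


def median_pitch_hz(gci_times_s, max_period_s=0.02):
    """Median F0 in Hz from glottal closure instants given in seconds.

    Intervals longer than max_period_s (an assumed cutoff, 20 ms here) are
    treated as unvoiced gaps between voiced segments and are ignored.
    """
    periods = [
        t1 - t0
        for t0, t1 in zip(gci_times_s, gci_times_s[1:])
        if 0.0 < t1 - t0 <= max_period_s
    ]
    if not periods:
        raise ValueError("no usable pitch periods in the epoch sequence")
    return 1.0 / median(periods)


def pitch_scale(source_gcis_s, target_gcis_s):
    """Hypothetical transformation scale: target median F0 over source median F0."""
    return median_pitch_hz(target_gcis_s) / median_pitch_hz(source_gcis_s)
```

For instance, epochs spaced 5 ms apart in a source-emotion utterance and 4 ms apart in a target-emotion utterance correspond to median pitches of about 200 Hz and 250 Hz, so this sketch would suggest raising the source pitch by a factor of roughly 1.25. In practice such a scale would be estimated from statistics over many utterances and speakers rather than from a single pair.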