High Quality Voice Morphing

What is Voice Morphing?

Voice morphing is a technique for modifying a source speaker's speech so
that it sounds as if it were spoken by a designated target speaker.
Research goals: to develop algorithms that can morph speech from one
speaker to another with the following properties.
1. High quality (natural and intelligible).
2. The morphing function can be trained automatically from speech data,
whether or not the same utterances are spoken by the source and target
speakers.
3. The ability to operate with target voice training data ranging from a
few seconds to tens of minutes.


Key Technical Issues

1. Mathematical Speech Model
   - For speech signal representation and modification
2. Acoustic Features
   - For speaker identification
3. Conversion Function
   - Involves methods for training and application
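
The training and application steps of a conversion function can be sketched as follows. This is only an illustration under strong assumptions: the feature vectors are taken to be already time-aligned between source and target, and the "linear transformation" is reduced to an independent affine map per feature dimension fitted by least squares. The names train_affine and convert are hypothetical, and the system described here may use a richer transform.

```python
def train_affine(source, target):
    """Fit target[d] ~ a[d] * source[d] + b[d] for each dimension d.

    source, target: equal-length lists of time-aligned feature vectors
    (assumed alignment; a real system would have to compute it).
    """
    n = len(source)
    dims = len(source[0])
    params = []
    for d in range(dims):
        xs = [frame[d] for frame in source]
        ys = [frame[d] for frame in target]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Fall back to a pure offset when the source dimension is constant.
        a = cov_xy / var_x if var_x > 0 else 1.0
        b = mean_y - a * mean_x
        params.append((a, b))
    return params


def convert(frame, params):
    """Apply the trained per-dimension affine map to one source frame."""
    return [a * x + b for x, (a, b) in zip(frame, params)]


# Toy aligned data: the target is 2*x + 1 in dim 0 and 0.5*x - 3 in dim 1.
source = [[0.0, 2.0], [1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
target = [[2 * x0 + 1, 0.5 * x1 - 3] for x0, x1 in source]

params = train_affine(source, target)        # training
converted = convert([4.0, 10.0], params)     # application to a new frame
```

Training sees only paired frames; application then needs nothing from the target speaker beyond the fitted parameters.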



Time and Pitch Modification using PSHM (Pitch Synchronous Harmonic Model)

1. Pitch Modification
   - It is essential to maintain the spectral structure while altering the
     fundamental frequency.
   - This is achieved by modifying the excitation components whilst keeping
     the original spectral envelope unaltered.
2. Time Modification
   - The PSHM model allows the analysis frames to be regarded as
     phase-independent units which can be arbitrarily discarded, copied and
     modified.
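
Both modifications can be illustrated at the parameter level. The sketch below is not the PSHM implementation: it assumes each frame is described only by a fundamental frequency and a list of harmonic amplitudes, approximates the spectral envelope by linear interpolation between those harmonics, and ignores phases and resynthesis. The function names are hypothetical.

```python
def envelope_at(freq, env_freqs, env_amps):
    """Linearly interpolate the spectral envelope at freq (clamped at the ends)."""
    if freq <= env_freqs[0]:
        return env_amps[0]
    if freq >= env_freqs[-1]:
        return env_amps[-1]
    for i in range(1, len(env_freqs)):
        if freq <= env_freqs[i]:
            lo, hi = env_freqs[i - 1], env_freqs[i]
            w = (freq - lo) / (hi - lo)
            return (1 - w) * env_amps[i - 1] + w * env_amps[i]


def shift_pitch(f0, harmonic_amps, factor, max_freq):
    """Move the harmonics to multiples of f0 * factor.

    The new harmonic amplitudes are read off the ORIGINAL envelope,
    so the spectral envelope itself is left unaltered.
    """
    env_freqs = [f0 * (k + 1) for k in range(len(harmonic_amps))]
    new_f0 = f0 * factor
    n_new = int(max_freq // new_f0)
    new_amps = [envelope_at(new_f0 * (k + 1), env_freqs, harmonic_amps)
                for k in range(n_new)]
    return new_f0, new_amps


def time_scale(frames, factor):
    """Stretch (factor > 1) or compress (factor < 1) by copying or discarding frames."""
    n_out = max(1, round(len(frames) * factor))
    last = len(frames) - 1
    return [frames[min(last, int(i / factor))] for i in range(n_out)]


# Raise the pitch of a 100 Hz frame by one octave, keeping the envelope.
new_f0, new_amps = shift_pitch(100.0, [1.0, 0.8, 0.6, 0.4], 2.0, 400.0)

# Double, then halve, the duration of a four-frame sequence.
stretched = time_scale(["a", "b", "c", "d"], 2.0)
compressed = time_scale(["a", "b", "c", "d"], 0.5)
```

Copying or dropping whole frames in time_scale is only reasonable because the frames are treated as phase-independent units, as stated above.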


Unnatural Phase Dispersion

- In the baseline system, the converted spectral envelope was combined
  with the original phases. This results in converted speech with a
  “harsh” quality.
- Spectral magnitudes and phases of human speech are highly correlated.
- Modelling the magnitudes and phases simultaneously and converting them
  both via a single unified transform is extremely difficult.


Transforming Unvoiced Sounds

- Theoretically, unvoiced sounds contain very little vocal tract
  information, so in our baseline system the unvoiced sounds are not
  transformed.
- In reality, many unvoiced sounds have some vocal tract colouring which
  affects the speech characteristics.
- Since the spectral envelopes of the unvoiced sounds have large
  variations, it is not effective to convert them using the linear
  transformation scheme.
- A simple approach based on unit selection and concatenation was
  therefore developed to transform the unvoiced sounds.
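
One possible reading of unit selection and concatenation is sketched below. The details are assumptions, not taken from the slides: each source unvoiced frame is replaced by the closest frame from a pool of target-speaker unvoiced frames, using squared Euclidean distance on the spectral envelope, with an optional join cost (join_weight, hypothetical) that favours candidates close to the previously selected unit.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two spectral envelope vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_unvoiced_units(source_frames, target_pool, join_weight=0.0):
    """Greedily pick one target unit per source unvoiced frame.

    cost = distance(source frame, candidate)
           + join_weight * distance(previously selected unit, candidate)
    The selected units are returned in order, ready to be concatenated.
    """
    selected = []
    previous = None
    for frame in source_frames:
        def cost(candidate):
            c = sq_dist(frame, candidate)
            if previous is not None:
                c += join_weight * sq_dist(previous, candidate)
            return c

        best = min(target_pool, key=cost)
        selected.append(best)
        previous = best
    return selected


# Toy pool of target-speaker unvoiced envelopes (two coefficients each).
target_pool = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
source_unvoiced = [[0.9, 1.2], [4.0, 4.5]]

units = select_unvoiced_units(source_unvoiced, target_pool)
```

Because the output is drawn directly from the target speaker's own frames, no linear transform has to cope with the large envelope variations noted above.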