voice morphing seminar report

15-02-2012, 11:13 AM

Hello, I am in dire need of a seminar report on voice morphing. My seminar is on 17th Feb.
I would truly appreciate it if you could assist me. Thanks a lot!

**seminar ideas** · 16-04-2012, 10:15 AM

To get information about the topic "VOICE MORPHING" (full report, PPT and related topics), refer to the links below:

https://seminarproject.net/Thread-voice-...ull-report

https://seminarproject.net/Thread-voice-morphing--823

https://seminarproject.net/Thread-voice-...act?page=3

https://seminarproject.net/Thread-voice-...act?page=5

**seminar flower** · 17-09-2012, 01:18 PM

Voice Morphing

What is Voice Morphing?

Voice morphing is a technique for modifying a (source) speaker's speech to sound as if it were spoken by a different (target) speaker.
In simpler terms, it means being able to change the speech of one speaker into that of another speaker.
Applications for voice morphing range from recreational ones to security ones.

How to Morph Voice?

We need to effectively change the pitch, for example from that of a male speaker to that of a female speaker. Recall that the excitation signal carries information about the speaker.
We find the LPC coefficients for the source and target signals, and using these coefficients we interpolate between the two signals.
We get the new LPC coefficients using an interpolation formula.
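
The formula itself does not appear in this post. As a rough sketch only, assuming the new coefficients are a plain linear blend of the source and target LPC coefficients, the step could look like this in Python with NumPy. The names `lpc` and `interpolate_lpc` and the blend weight `alpha` are illustrative assumptions, not taken from the presentation.

```python
import numpy as np

def lpc(frame, order):
    """LPC coefficients [1, a1, ..., a_order] via the autocorrelation
    method, solved with the Levinson-Durbin recursion."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for model order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def interpolate_lpc(a_src, a_tgt, alpha):
    """Assumed linear blend: alpha=0 gives the source, alpha=1 the target."""
    return (1.0 - alpha) * np.asarray(a_src) + alpha * np.asarray(a_tgt)

# Toy signals: decaying exponentials behave like first-order predictors
a_src = lpc(0.9 ** np.arange(200), order=1)   # a_src[1] is close to -0.9
a_tgt = lpc(0.5 ** np.arange(200), order=1)   # a_tgt[1] is close to -0.5
a_mid = interpolate_lpc(a_src, a_tgt, 0.5)    # halfway between the two
```

Blending LPC coefficients directly is not guaranteed to give a stable synthesis filter; interpolating reflection coefficients or line spectral frequencies is a common alternative.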

Applications

In public announcement systems, the voice can be made to sound like that of a popular public speaker. This could be used in many places, such as railway announcements.
Video and image morphing is extensively used for film and graphical special effects.
A text-to-speech system converts normal language text into speech; other systems render symbolic linguistic representations, such as phonetic transcriptions, into speech.

Limitations

Voice detection is done via sophisticated 3D rendering, but there are a lot of normalizing problems.
Some applications require extensive sound libraries.
Different languages require different phonetics, so updating or extending the system is tedious.
It is very seldom complete (we may not be able to add every bit of small talk and every phonetic unit to the database).

**study tips** · 29-08-2013, 03:15 PM

Voice Morphing Seminar Report

Introduction

Voice morphing means the transition of one speech signal into another. Speech morphing is analogous to image morphing: in image morphing, the in-between images all show one face smoothly changing its shape and texture until it turns into the target face. A speech morph should possess the same feature. One speech signal should smoothly change into another, keeping the shared characteristics of the starting and ending signals while smoothly changing the other properties. The major properties of concern in a speech signal are its pitch and envelope information. These two reside in a convolved form in the speech signal, so an efficient method for extracting each of them is necessary. We have adopted an uncomplicated approach, namely cepstral analysis, to do this: pitch and formant information in each signal is extracted using the cepstral approach. The processing needed to obtain the morphed speech signal includes cross-fading of the envelope information, Dynamic Time Warping to match the major signal features (pitch), and signal re-estimation to convert the morphed speech signal back into an acoustic waveform.

An Introspection of the Morphing Process

We undertook this work because it sounded challenging and interesting. We were eager to know whether speech morphing would be feasible using the cepstral approach. Processes like cepstral analysis and the re-estimation of the morphed speech signal into an acoustic waveform are intricate and challenging. The project also digs deep into the basics of digital signal processing, or rather speech processing, and covers a lot of ground in that field.
Speech morphing can be achieved by transforming the signal’s representation from the acoustic waveform obtained by sampling the analog signal, with which many people are familiar, to another representation. To prepare the signal for the transformation, it is split into a number of 'frames', which are sections of the waveform. The transformation is then applied to each frame of the signal. This provides another way of viewing the signal information. The new representation (said to be in the frequency domain) describes the average energy present at each frequency band.
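
As an illustration of that change of representation, here is a minimal sketch, assuming NumPy's real FFT is used as the transformation. The function name `frame_power_spectrum` and the test tone are hypothetical; the report does not say which implementation was used.

```python
import numpy as np

def frame_power_spectrum(frame, sample_rate):
    """Return (frequencies in Hz, power) for one frame of the waveform."""
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)                  # frequency-domain view
    power = np.abs(spectrum) ** 2 / len(frame)     # energy per frequency bin
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs, power

# A 50 ms frame of a 200 Hz tone sampled at 8 kHz (400 samples)
sample_rate = 8000
t = np.arange(400) / sample_rate
frame = np.sin(2 * np.pi * 200.0 * t)
freqs, power = frame_power_spectrum(frame, sample_rate)
peak_hz = freqs[np.argmax(power)]                  # bin with the most energy
```

With these numbers the bins are 20 Hz apart, so the 200 Hz tone falls exactly on a bin and the energy concentrates there.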

Morphing Process: A Comprehensive Analysis

The algorithm to be used is shown in the simplified block diagram given below. It contains a number of fundamental signal processing methods, including sampling, the discrete Fourier transform and its inverse, and cepstral analysis. However, the main processes can be categorized as follows.
I. Preprocessing or representation conversion: This involves processes like signal acquisition in discrete form and windowing.
II. Cepstral analysis or Pitch and Envelope analysis: This process will extract the pitch and formant information in the speech signal.
III. Morphing, which includes warping and interpolation.
IV. Signal re-estimation.
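
Step II can be sketched as follows. This is a minimal example, assuming a real cepstrum computed with NumPy's FFT on a frame that has already been windowed; the function name `cepstral_analysis`, the pitch search range (`f0_min`, `f0_max`) and the lifter cutoff are assumptions made for illustration.

```python
import numpy as np

def cepstral_analysis(frame, sample_rate, f0_min=60.0, f0_max=400.0, eps=1e-10):
    """Split one frame into a pitch estimate and a smooth log envelope."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    q_min = int(sample_rate / f0_max)      # shortest pitch period, in samples
    q_max = int(sample_rate / f0_min)      # longest pitch period, in samples
    if q_max >= n // 2:
        raise ValueError("frame too short for the requested pitch range")

    # Real cepstrum: inverse FFT of the log magnitude spectrum
    log_mag = np.log(np.abs(np.fft.rfft(frame)) + eps)
    cepstrum = np.fft.irfft(log_mag, n=n)

    # Pitch: strongest cepstral peak inside the allowed quefrency range
    peak_q = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    f0 = sample_rate / peak_q

    # Envelope: keep only the low-quefrency part (and its mirror image)
    liftered = np.zeros(n)
    liftered[:q_min] = cepstrum[:q_min]
    liftered[n - q_min + 1:] = cepstrum[n - q_min + 1:]
    log_envelope = np.fft.rfft(liftered).real
    return f0, cepstrum, log_envelope

# Toy frame: an impulse train with a period of 80 samples (100 Hz at 8 kHz)
frame = np.zeros(800)
frame[::80] = 1.0
f0, cepstrum, log_envelope = cepstral_analysis(frame, sample_rate=8000)
```

The log turns the convolution of excitation and vocal tract into a sum, which is why the pitch peak and the envelope can be picked apart by quefrency.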

Acoustics of speech production

Speech production can be viewed as a filtering operation in which a sound source excites a vocal tract filter. The source may be periodic, resulting in voiced speech, or noisy and aperiodic, causing unvoiced speech. As a periodic signal, voiced speech has a spectrum consisting of harmonics of the fundamental frequency of the vocal cord vibration; this frequency, often abbreviated as F0, is the physical aspect of the speech signal corresponding to the perceived pitch. Thus pitch refers to the fundamental frequency of the vocal cord vibrations or the resulting periodicity in the speech signal. This F0 can be determined either from the periodicity in the time domain or from the regularly spaced harmonics in the frequency domain.
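
The time-domain route to F0 can be sketched with a plain autocorrelation search. This is only an illustration under assumed settings (the name `estimate_f0_autocorr` and the 60 to 400 Hz search range are not from the report).

```python
import numpy as np

def estimate_f0_autocorr(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 from the lag at which the frame best matches itself."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    # Autocorrelation for non-negative lags only
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / lag

# Toy voiced frame: 125 Hz fundamental plus two harmonics, sampled at 8 kHz
sample_rate = 8000
t = np.arange(512) / sample_rate
voiced = (np.sin(2 * np.pi * 125.0 * t)
          + 0.5 * np.sin(2 * np.pi * 250.0 * t)
          + 0.25 * np.sin(2 * np.pi * 375.0 * t))
f0_est = estimate_f0_autocorr(voiced, sample_rate)
```

The period here is 64 samples, so the autocorrelation peaks at lag 64 and the estimate comes out at 125 Hz.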
The vocal tract can be modeled as an acoustic tube with resonances, called formants, and antiresonances. (The formants are abbreviated as Fi, where F1 is the formant with the lowest center frequency.) Moving certain structures in the vocal tract alters the shape of the acoustic tube, which in turn changes its frequency response. The filter amplifies energy at and near formant frequencies, while attenuating energy around antiresonant frequencies between the formants.

Signal Acquisition

Before any processing can begin, the sound signal that is created by some real-world process has to be brought into the computer by some method. This is called sampling. A fundamental aspect of digital signal processing (in this case of sound) is that it is based on processing sequences of samples. When a natural process, such as a musical instrument, produces sound, the signal produced is analog (continuous-time) because it is defined along a continuum of times. A discrete-time signal is represented by a sequence of numbers; the signal is only defined at discrete times. A digital signal is a special instance of a discrete-time signal in which both time and amplitude are discrete. Each discrete representation of the signal is termed a sample.
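
To make the distinction concrete, here is a small sketch of sampling (discrete time) followed by quantization (discrete amplitude). The function `sample_and_quantize`, the stand-in "analog" tone and the 8-bit depth are hypothetical choices for illustration.

```python
import numpy as np

def sample_and_quantize(analog, duration_s, sample_rate, n_bits=16):
    """Sample a continuous-time function and quantize it to signed integers.

    `analog` is assumed to map an array of times (seconds) to amplitudes
    in the range [-1, 1].
    """
    n_samples = int(round(duration_s * sample_rate))
    t = np.arange(n_samples) / sample_rate      # discrete sample times
    x = np.asarray(analog(t), dtype=float)      # discrete time, continuous amplitude
    levels = 2 ** (n_bits - 1)
    q = np.clip(np.round(x * levels), -levels, levels - 1)  # discrete amplitude
    return t, q.astype(np.int32)

def tone(t):
    """Stand-in for an analog source: a 1 kHz sine wave."""
    return np.sin(2 * np.pi * 1000.0 * t)

t, q = sample_and_quantize(tone, duration_s=0.01, sample_rate=8000, n_bits=8)
```

Each entry of `q` is one sample in the sense used above: a number defined only at a discrete time and restricted to a finite set of amplitude levels.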

Windowing

A DFT (Discrete Fourier Transform) can only deal with a finite amount of information. Therefore, a long signal must be split up into a number of segments. These are called frames. Generally, speech signals are constantly changing, so the aim is to make the frame short enough for the segment to be almost stationary and yet long enough to resolve consecutive pitch harmonics. Therefore, the length of such frames tends to be in the region of 25 to 75 milliseconds.
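
A minimal framing sketch follows, assuming 40 ms frames with a 20 ms hop and a Hann window. Only the 25 to 75 ms range comes from the text; the hop size, the window type and the name `split_into_frames` are assumptions.

```python
import numpy as np

def split_into_frames(signal, sample_rate, frame_ms=40.0, hop_ms=20.0):
    """Cut a signal into overlapping, Hann-windowed frames (one per row)."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)              # tapers the frame edges to zero
    return np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])

# One second of a constant signal at 8 kHz gives 49 frames of 320 samples
frames = split_into_frames(np.ones(8000), sample_rate=8000)
```

Overlapping the frames means that samples attenuated near the edge of one window sit near the middle of a neighbouring one.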

Morphing

Matching and Warping: Background theory

Both signals will have a number of 'time-varying properties'. To create an effective morph, it is necessary to match one or more of these properties of each signal to those of the other signal in some way. The property of concern is the pitch of the signal (although other properties, such as the amplitude, could be used), and it will have a number of features. It is almost certain that matching features do not occur at exactly the same point in each signal. Therefore, each feature must be moved to some point in between its position in the first sound and its position in the second sound. In other words, to smoothly morph the pitch information, the pitch present in each signal needs to be matched and then the amplitude at each frequency cross-faded. To perform the pitch matching, a pitch contour for the entire signal is required. This is obtained by using the pitch peak location in each cepstral pitch slice.
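
Matching two pitch contours is the job of Dynamic Time Warping, mentioned in the introduction. The following is a textbook DTW sketch with an absolute-difference cost, offered as an illustration; the report does not specify its cost function or step pattern, and the name `dtw_path` and the toy contours are made up.

```python
import numpy as np

def dtw_path(a, b):
    """Return (total cost, match path) aligning contour a with contour b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])

    # Walk back from the end, always stepping to the cheapest predecessor
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n, m], path

# Two toy pitch contours (Hz per frame); the second holds 110 Hz one frame longer
contour1 = [100.0, 110.0, 120.0, 110.0, 100.0]
contour2 = [100.0, 110.0, 110.0, 120.0, 110.0, 100.0]
total, path = dtw_path(contour1, contour2)
```

Each pair `(i, j)` in `path` says that frame `i` of the first signal corresponds to frame `j` of the second.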

Morphing Stage

Now we shall give a detailed account of how the morphing process is carried out. The overall aim in this section is to make a smooth transition from signal 1 to signal 2. This is partially accomplished by the 2D array of the match path provided by the DTW. At this stage, it was decided exactly what form the morph would take. The implementation chosen was to perform the morph over the duration of the longer signal. In other words, the final morphed speech signal would have the duration of the longer signal. In order to accomplish this, the 2D array is interpolated to provide the desired duration.
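
One way to read "the 2D array is interpolated" is to resample the match path so that it has exactly as many entries as the longer signal has frames. The sketch below assumes linear interpolation along the path; `resample_path` and the example path are hypothetical.

```python
import numpy as np

def resample_path(path, n_out):
    """Linearly resample a DTW match path to n_out (possibly fractional) pairs."""
    path = np.asarray(path, dtype=float)
    idx = np.arange(len(path))
    src_pos = np.linspace(0.0, len(path) - 1, num=n_out)
    i_interp = np.interp(src_pos, idx, path[:, 0])   # positions in signal 1
    j_interp = np.interp(src_pos, idx, path[:, 1])   # positions in signal 2
    return np.column_stack([i_interp, j_interp])

# Match path for signals of 4 and 5 frames; the morph should last 5 frames
match_path = [(0, 0), (1, 1), (1, 2), (2, 3), (2, 4), (3, 4)]
morph_path = resample_path(match_path, n_out=5)
```

Fractional frame positions in the result would then need rounding or interpolation between neighbouring frames, a detail this sketch leaves open.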
However, one problem still remains: the interpolated pitch of each morph slice. If no interpolation were to occur, then this would be equivalent to the warped cross-fade, which would still be likely to result in a sound with two pitches. Therefore, a pitch in between those of the first and second signals must be created. The precise properties of this manufactured pitch peak are governed by how far through the morph the process is. At the beginning of the morph, the pitch peak will take on more characteristics of the signal 1 pitch peak (peak value and peak location) than of the signal 2 peak. Towards the end of the morph, the peak will bear more resemblance to the signal 2 peak. The variable l is used to control the balance between signal 1 and signal 2. At the beginning of the morph, l has the value 0 and upon completion, l has the value 1. Consider the example in Figure 4.6. This diagram shows a sample cepstral slice with the pitch peak area highlighted. Figure 4.7 shows another sample cepstral slice, again with the same information highlighted. To illustrate the morph process, these two cepstral slices shall be used.
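
The role of l can be sketched as a linear blend of the two pitch peaks, with the envelopes cross-faded by the same weight. The linear form is an assumption, since the text only fixes the end points l = 0 and l = 1; the names `morph_pitch_peak` and `crossfade` and the sample numbers are illustrative, and `lam` stands for the variable l.

```python
import numpy as np

def morph_pitch_peak(peak1, peak2, lam):
    """Blend two cepstral pitch peaks given as (location, value) pairs.

    lam = 0 reproduces the signal 1 peak, lam = 1 the signal 2 peak.
    """
    loc1, val1 = peak1
    loc2, val2 = peak2
    loc = (1.0 - lam) * loc1 + lam * loc2     # peak location (quefrency)
    val = (1.0 - lam) * val1 + lam * val2     # peak height
    return loc, val

def crossfade(env1, env2, lam):
    """Cross-fade two envelope slices with the same balance variable."""
    return (1.0 - lam) * np.asarray(env1, dtype=float) + lam * np.asarray(env2, dtype=float)

# Peaks at quefrency 80 and 100 samples; halfway through the morph
start = morph_pitch_peak((80.0, 0.5), (100.0, 0.3), 0.0)
middle = morph_pitch_peak((80.0, 0.5), (100.0, 0.3), 0.5)
end = morph_pitch_peak((80.0, 0.5), (100.0, 0.3), 1.0)
mixed_env = crossfade([0.0, 2.0], [2.0, 0.0], 0.25)
```

Because a single peak is manufactured at the blended location, the morphed slice carries one pitch instead of the two that a plain cross-fade would leave behind.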
