Isolated-word speech recognition using hidden Markov models

Introduction

Speech recognition is a challenging problem on which much work has been done over the
last decades. Some of the most successful results have been obtained by using hidden
Markov models, as explained by Rabiner in 1989 [1].
A well-functioning generic speech recognizer would enable more efficient communication
for everybody, but especially for children, people who cannot read or write, and people
with disabilities. A speech recognizer could also be a subsystem in a speech-to-speech
translator.
The speech recognition system implemented during this project trains one hidden
Markov model for each word that it should be able to recognize. The models are trained
with labeled training data, and the classification is performed by passing the features to
each model and then selecting the best match.
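
The classification step described above can be sketched as follows. This is a minimal
illustration, not the project's implementation: it assumes that each trained word model
exposes a scoring function returning a log-likelihood for a feature sequence, and that
"best match" means the highest score. The names classify, Features and Scorer are
hypothetical.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[Sequence[float]]  # one feature vector per frame
Scorer = Callable[[Features], float]  # assumed: log-likelihood of the features under one word model


def classify(features: Features, models: Dict[str, Scorer]) -> str:
    """Return the word whose model assigns the highest score to the features."""
    if not models:
        raise ValueError("at least one word model is required")
    # Score the same feature sequence with every word model and keep the best one.
    return max(models, key=lambda word: models[word](features))
```

With one scorer per vocabulary word, classify(features, models) returns the label of
the best-scoring model. Working with log-likelihoods rather than raw likelihoods is a
common choice because it avoids numerical underflow on long feature sequences.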

Background theory

Hidden Markov models


Basic knowledge of hidden Markov models is assumed, but the two most important
algorithms used in this project will be described.
The observable output from a hidden state is assumed to be generated by a multivariate
Gaussian distribution, so there is one mean vector and one covariance matrix for
each state. We will also assume that the state transition probabilities are independent
of time, such that the hidden Markov chain is homogeneous.
We will now define the notation for describing a hidden Markov model as used in
this project. There are $N$ states in total. An element $a_{ss'}$ of the transition
probability matrix $A$ denotes the transition probability from state $s$ to state $s'$,
and the probability for the chain to start in state $s$ is $\pi_s$. The mean vector and
covariance matrix of the multivariate Gaussian distribution modeling the observable
output from state $s$ are $\mu_s$ and $\Sigma_s$, respectively. For an observation $o$,
$b_s(o)$ denotes the probability density of the multivariate Gaussian distribution of
state $s$ evaluated at $o$. We will sometimes denote the collection of parameters
describing the hidden Markov model as $\lambda = \{A, \pi, \mu, \Sigma\}$.
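
To make the notation concrete, the parameter set λ = {A, π, μ, Σ} and the emission
density can be sketched in code. This is a minimal sketch using only the Python standard
library; the class name GaussianHMM, the method log_b and the helper cholesky are
hypothetical and not taken from the project. It assumes each covariance matrix is
symmetric positive definite, and it returns the logarithm of the density, which is
usually more convenient numerically.

```python
import math
from dataclasses import dataclass
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def cholesky(m: Matrix) -> Matrix:
    """Lower-triangular factor L with L L^T = m (m symmetric positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


@dataclass
class GaussianHMM:
    A: Matrix            # A[s][t]: transition probability from state s to state t
    pi: Vector           # pi[s]: probability that the chain starts in state s
    mu: List[Vector]     # mu[s]: mean vector of the Gaussian for state s
    sigma: List[Matrix]  # sigma[s]: covariance matrix of the Gaussian for state s

    def log_b(self, s: int, o: Vector) -> float:
        """Logarithm of the Gaussian density of state s evaluated at observation o."""
        d = len(o)
        low = cholesky(self.sigma[s])
        diff = [o[i] - self.mu[s][i] for i in range(d)]
        # Forward substitution: solve low * y = diff, so that y.y is the
        # quadratic form diff^T sigma^{-1} diff.
        y = [0.0] * d
        for i in range(d):
            y[i] = (diff[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
        quad = sum(v * v for v in y)
        # log det(sigma) is twice the sum of the logs of the Cholesky diagonal.
        log_det = 2.0 * sum(math.log(low[i][i]) for i in range(d))
        return -0.5 * (d * math.log(2.0 * math.pi) + log_det + quad)
```

Because the chain is assumed homogeneous, a single matrix A is enough; a
time-dependent chain would need one transition matrix per time step.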

Conclusion and future work

During this project a system for isolated-word speech recognition was implemented and
tested. The cross-validation results are good for a single speaker. Two obvious extensions
are better support for several speakers and support for continuous speech. The first step
towards the former would be more, and more robust, features.