31-07-2013, 12:55 PM
An Assistive Body Sensor Network Glove for Speech and Hearing Impaired Disabilities
Assistive Body Sensor.docx (Size: 1.07 MB / Downloads: 31)
INTRODUCTION
Communication is an essential element of human life. It is a process that generates mutual understanding and enables us to live in harmony. For people with speech and hearing impairments, non-verbal communication is especially important. This type of communication exists in various forms, ranging from facial expression and lip motion to sign language and handwriting.
In particular, sign language, expressed through body or hand gestures, is one of the most natural means of communication among deaf and mute people. Since they cannot perceive or generate acoustic information, they usually exchange information visually. Most hearing people, however, have no knowledge of any specific sign language, so everyday communication with the hearing population poses a major challenge to those with these disabilities. To alleviate this communication problem, an assistive technology that provides a more convenient and less time-consuming means of communication is required.
In this study, an assistive device for speech- and hearing-impaired people has been developed based on Body Sensor Network (BSN) technology. In the proposed system, real-time recognition of American Sign Language (ASL) finger spelling gestures is performed on input signals acquired from a wireless sensor glove. The recognized gestures are then mapped into corresponding sounds using a speech synthesizer. The main focus here is on the hierarchical framework for hand gesture recognition.
RELATED WORK
Automatic gesture recognition is a key component for the development of a gesture-based communication device. In general, most sign language postures can be recognized based on four distinctive features, namely shape, position, orientation, and motion sequence. With modern sensor technology, these main features can be captured and interpreted.
Research on hand gesture recognition has been conducted for various languages. Jiangqin et al. [2] proposed the use of a CyberGlove with 15 sensors and a 3D motion tracker for capturing input signals from the hands to recognize 26 words in Chinese Sign Language (CSL). In [3], a prototype for automatic recognition of 250 Taiwan Sign Language (TSL) vocabularies was developed. In that study, Data Gloves with 10 flex sensors were used for detecting finger flexion, a Polhemus 3D tracker was used for detecting hand motion, and five tact switches were used for detecting particular points touched by the fingers. In [4], a k-means model was adopted for Korean Sign Language (KSL) finger spelling gesture recognition based on a simple and low-cost device equipped with tilt and flex sensors. Kim et al. [5] used a fuzzy min-max neural network for recognizing KSL gestures based on signals from a Data Glove.
ON-LINE GESTURE RECOGNITION
The online gesture recognition model can be divided into two parts: threshold-based segmentation and classification with a probabilistic model, augmented by rules for accuracy enhancement.
Threshold-based segmentation separates individual gestures from the continuous signal stream: a gesture is extracted once the signal for that gesture crosses a predefined threshold.
Classification with a probabilistic model is designed to further enhance the recognition accuracy; the recognition model is built on probability density functions. Based on cluster analysis of confused letters, each group of misclassifications can be fixed either by incorporating temporal information from a linguistic model or by using a model with a different feature set.
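The threshold-based segmentation step can be sketched as follows. This is an illustrative sketch only: the threshold value, minimum segment length, and the 1-D activity signal are assumptions, not values from the paper.

```python
# Sketch of threshold-based gesture segmentation (illustrative; the
# threshold and minimum-length values are assumptions, not from the paper).

def segment_gestures(samples, threshold=0.5, min_length=5):
    """Split a 1-D activity signal into gesture segments.

    A segment starts when the signal rises to `threshold` or above and
    ends when it falls back below; segments shorter than `min_length`
    samples are discarded as noise.
    """
    segments = []
    start = None
    for i, value in enumerate(samples):
        if value >= threshold and start is None:
            start = i                        # gesture onset
        elif value < threshold and start is not None:
            if i - start >= min_length:
                segments.append((start, i))  # gesture offset
            start = None
    if start is not None and len(samples) - start >= min_length:
        segments.append((start, len(samples)))
    return segments

# Example: two bursts of activity separated by rest
signal = [0.1] * 10 + [0.9] * 20 + [0.1] * 10 + [0.8] * 15 + [0.1] * 5
print(segment_gestures(signal))  # [(10, 30), (40, 55)]
```

At a 50 Hz sampling rate, a minimum length of 5 samples corresponds to rejecting bursts shorter than 100 ms.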
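A minimal sketch of classification with per-class Gaussian density functions is shown below. For simplicity it uses diagonal covariances, and the per-letter means and variances are invented for illustration; the paper's actual model parameters come from the training data.

```python
import math

# Sketch of maximum-likelihood classification with per-class Gaussian
# densities (diagonal covariance for simplicity; the actual covariance
# structure and parameters in the paper may differ).

def gaussian_log_pdf(x, mean, var):
    """Log density of a diagonal multivariate Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def classify(x, models):
    """Return the label whose Gaussian gives `x` the highest likelihood."""
    return max(models, key=lambda label: gaussian_log_pdf(x, *models[label]))

# Toy per-letter models: (mean flex vector, variance vector), five flex
# sensor readings normalized to [0, 1]. Values are invented for illustration.
models = {
    "A": ([0.9, 0.9, 0.9, 0.9, 0.1], [0.01] * 5),  # fingers curled, thumb out
    "B": ([0.1, 0.1, 0.1, 0.1, 0.9], [0.01] * 5),  # fingers straight
}
print(classify([0.85, 0.92, 0.88, 0.9, 0.15], models))  # "A"
```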
IMPLEMENTATION
DATA COLLECTION
In order to validate the proposed method for model construction, an experiment on full-sentence recognition was conducted. To enable the system to recognize English phrases and sentences, two additional symbols, “space” (SP) and “full stop” (FS), were introduced alongside the 26 ASL finger spelling gestures. They were defined as the “thumb up” and “thumb down” gestures, respectively, as shown in Fig 3.1. These two gestures were selected because they are easily separated from the other gestures in the ASL finger spelling vocabulary set.
For model construction, a training dataset was collected from ten normal subjects, six females and four males aged between 20 and 26 years. Each subject was asked to perform the 28 gestures five times while wearing the BSN sensor glove equipped with 5 flex sensors and a 3D accelerometer. A sampling rate of 50 Hz was used.
MODEL CONSTRUCTION
By applying the multivariate Gaussian model with five flex features to the training dataset, the confusion matrix shown in Fig. 3.4 was generated. Misclassifications appear as the non-zero off-diagonal elements of the matrix. From the dependency graph generated by these misclassifications, clusters of confused letters can be systematically identified. Fig. 3.5 illustrates the six confused clusters derived from the confusion matrix. Note that only misclassifications occurring more than once were considered. Rules can then be derived for each confused cluster to further enhance the recognition accuracy.
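The cluster-identification step described above can be sketched as a graph problem: misclassifications that occur more than once become edges between letters, and the connected components of that graph are the confused clusters. The confusion-matrix values below are invented for illustration and are not the paper's Fig. 3.4 data.

```python
# Sketch: derive confused-letter clusters from a confusion matrix by
# treating repeated misclassifications (count >= min_count) as edges of
# an undirected graph and taking its connected components. The matrix
# values below are invented for illustration.

def confused_clusters(confusion, labels, min_count=2):
    # Build adjacency from qualifying off-diagonal entries
    adjacency = {label: set() for label in labels}
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            if i != j and confusion[i][j] >= min_count:
                adjacency[row_label].add(col_label)
                adjacency[col_label].add(row_label)
    # Connected components via depth-first search
    clusters, seen = [], set()
    for label in labels:
        if label in seen or not adjacency[label]:
            continue
        stack, component = [label], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

labels = ["M", "N", "S", "T", "A"]
confusion = [
    [20, 3, 2, 0, 0],  # M sometimes read as N or S
    [4, 18, 0, 1, 0],  # N sometimes read as M
    [3, 0, 19, 0, 0],  # S sometimes read as M
    [0, 1, 0, 22, 0],  # single T/N confusions are ignored
    [0, 0, 0, 0, 25],
]
print(confused_clusters(confusion, labels))  # [['M', 'N', 'S']]
```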
Misclassifications in each confused cluster can be resolved either by using a multivariate Gaussian model with a more discriminative set of features or by using a bigram model as an alternative. To identify the discriminative feature set for each confused cluster, the multi-objective Bayesian Framework for Feature Selection (multi-objective BFFS) [12] was applied to each portion of the training dataset consisting of a subset of confused letters.
CONCLUSION
This study proposed a framework for constructing an ASL finger spelling gesture recognition model based on data acquired from a wireless BSN sensor glove. The glove consists of five flex sensors and a 3D accelerometer, providing measures of finger bending as well as hand motion and orientation. Based on the framework, a hierarchical model for recognizing ASL finger spelling gestures has been adopted. The subject-independent recognition results on the “The quick brown fox jumps over the lazy dog.” pangram indicate that the model with rules for accuracy enhancement outperforms the original multivariate Gaussian model. A straightforward method for further improving the recognition accuracy is to compare each detected word against a closed vocabulary set. Any mismatched word can then be corrected before the text is forwarded to a speech synthesizer.
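The closed-vocabulary correction step suggested above could be implemented by snapping each decoded word to its nearest vocabulary entry under edit distance. This is a sketch under that assumption; the paper does not specify the matching method, and the vocabulary here is just the pangram's words.

```python
# Sketch of closed-vocabulary correction: each decoded word is replaced
# by the nearest vocabulary entry under Levenshtein distance. The choice
# of edit distance is an assumption; the paper does not specify the
# matching method.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]

def correct(word, vocabulary):
    """Replace `word` with the closest vocabulary entry."""
    return min(vocabulary, key=lambda v: edit_distance(word, v))

vocabulary = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}
print(correct("qvick", vocabulary))  # "quick"
print(correct("fux", vocabulary))    # "fox"
```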