13-09-2017, 01:07 PM
In computational linguistics, word sense disambiguation (WSD) is an open problem of natural language processing and ontology. WSD is the task of identifying which sense of a word (i.e. which meaning) is used in a sentence when the word has multiple meanings. A solution to this problem affects other computer-related tasks, such as discourse analysis, improving the relevance of search engines, anaphora resolution, coherence, and inference.
The human brain is quite proficient at word sense disambiguation. That natural language is formed in a way that requires so much of it is a reflection of that neurological reality. In other words, human language developed in a way that reflects (and has also helped to shape) the innate capacity provided by the brain's neural networks. In computer science and the information technology it enables, it has been a long-term challenge to develop the ability of computers to do natural language processing and machine learning.
A rich variety of techniques has been investigated, from dictionary-based methods that use the knowledge encoded in lexical resources, to supervised machine learning methods in which a classifier is trained for each distinct word on a corpus of manually sense-annotated examples, to completely unsupervised methods that cluster occurrences of words, thereby inducing word senses. Among these, supervised learning approaches have been the most successful algorithms to date.
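As a concrete illustration of the dictionary-based family, here is a minimal sketch of the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the sentence containing the target word. The two-sense inventory for "bank" below is a made-up toy example, not drawn from any real lexical resource.

```python
# Simplified Lesk: choose the sense whose gloss overlaps most with
# the words surrounding the target word in the sentence.

def simplified_lesk(word, sentence, sense_glosses):
    """Return the sense id whose gloss shares the most words with the context."""
    context = set(sentence.lower().split()) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical two-sense inventory for "bank" (illustrative glosses only).
GLOSSES = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water such as a river",
}

sense = simplified_lesk("bank", "he sat on the bank of the river fishing", GLOSSES)
# The river gloss shares "of" and "river" with the context, so it wins.
```

Real systems would use a proper lexicon such as WordNet and weight the overlap, but the bag-of-words gloss comparison above is the core idea.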
The accuracy of current algorithms is difficult to state without a host of caveats. In English, accuracy at the coarse-grained (homograph) level is routinely above 90%, with some methods on particular homographs achieving over 96%. On finer-grained sense distinctions, top accuracies from 59.1% to 69.0% have been reported in evaluation exercises (SemEval-2007, Senseval-2), where the baseline accuracy of the simplest possible algorithm of always choosing the most frequent sense was 51.4% and 57%, respectively.
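The most-frequent-sense baseline mentioned above can be sketched in a few lines: for each word, always predict the sense seen most often in training data, and score that prediction on test items. The tiny train/test lists here are hypothetical, only to show how the baseline accuracy is computed.

```python
# Most-frequent-sense (MFS) baseline for WSD evaluation.
from collections import Counter

def mfs_baseline(train, test):
    """train/test: lists of (word, gold_sense) pairs; returns accuracy."""
    counts = {}
    for word, sense in train:
        counts.setdefault(word, Counter())[sense] += 1
    correct = 0
    for word, gold in test:
        # Predict the sense most frequently seen for this word in training.
        if word in counts and counts[word].most_common(1)[0][0] == gold:
            correct += 1
    return correct / len(test)

train = [("bank", "finance"), ("bank", "finance"), ("bank", "river")]
test = [("bank", "finance"), ("bank", "river")]
acc = mfs_baseline(train, test)  # predicts "finance" both times -> 0.5
```

Despite its simplicity, this baseline is hard to beat on fine-grained sense inventories, which is why the 51.4% and 57% figures are reported alongside system results.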