Gerard Escudero, Lluís Màrquez, German Rigau
This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to clarify some confusing information about the comparison between the two methods that appears in the related literature. To this end, several directions have been explored, including testing several modifications of the basic learning algorithms and varying the feature space. Secondly, an improvement to both algorithms is proposed for dealing with large attribute sets. This modification, which basically consists of using only the positive information appearing in the examples, greatly improves the efficiency of the methods with no loss in accuracy. The experiments have been performed on the largest sense-tagged corpus available, containing the most frequent and ambiguous English words. Results show that the Exemplar-based approach to WSD is generally superior to the Bayesian approach, especially when a specific metric for dealing with symbolic attributes is used.
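As a rough illustration of the "positive information" idea mentioned in the abstract, the sketch below scores senses with a Naive Bayes classifier that only visits the features actually present in an example, rather than iterating over the whole (mostly absent) binary attribute set. It is a sketch under stated assumptions, not the authors' implementation: the class name `PositiveNaiveBayes`, the add-one smoothing, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class PositiveNaiveBayes:
    """Naive Bayes over sparse binary features, scoring only present features.

    Hypothetical sketch: names and the smoothing scheme are assumptions,
    not taken from the paper.
    """

    def __init__(self):
        self.sense_counts = Counter()               # examples seen per sense
        self.feature_counts = defaultdict(Counter)  # sense -> feature -> count
        self.total = 0                              # total number of examples

    def fit(self, examples):
        # examples: iterable of (set_of_present_features, sense) pairs
        for features, sense in examples:
            self.sense_counts[sense] += 1
            self.total += 1
            for f in features:
                self.feature_counts[sense][f] += 1
        return self

    def predict(self, features):
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            score = math.log(n / self.total)  # log prior of the sense
            # Only present features contribute; absent ones are never visited.
            for f in features:
                # Add-one smoothing for a binary feature (an assumption here).
                p = (self.feature_counts[sense][f] + 1) / (n + 2)
                score += math.log(p)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Toy data (invented for illustration): context words for the noun "bank".
train = [
    ({"money", "deposit"}, "bank/finance"),
    ({"loan", "money"}, "bank/finance"),
    ({"river", "water"}, "bank/shore"),
    ({"river", "fishing"}, "bank/shore"),
]
model = PositiveNaiveBayes().fit(train)
print(model.predict({"money", "account"}))
print(model.predict({"river"}))
```

In this sketch the cost of a prediction grows with the number of features present in the example times the number of senses, not with the size of the full attribute set, which is the kind of efficiency gain the abstract alludes to.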
Keywords: Natural Language Processing, Machine Learning, Human Language Processing, Bayesian and Case-Based Learning
Citation: Gerard Escudero, Lluís Màrquez, German Rigau: Naive Bayes and Exemplar-based Approaches to Word Sense Disambiguation Revisited. In W. Horn (ed.): ECAI2000, Proceedings of the 14th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2000, pp. 421-425.