15th European Conference on Artificial Intelligence
July 21-26, 2002, Lyon, France
James Stuart Aitken
The objective of this work is to learn information extraction rules by applying Inductive Logic Programming (ILP) techniques to natural language data. The approach is ontology-based: the extraction rules conclude with specific ontology relations that characterise the meaning of sentences in the text. An existing ILP system, FOIL, is used to learn attribute-value relations, which enables instances of these relations to be identified in the text. Specifically, we explore the linguistic preprocessing of the data, the use of background knowledge in the learning process, and the practical considerations of applying a supervised learning approach to rule induction, namely the human effort required to create the data set and the biases inherent in the use of small data sets.
Keywords: Information Extraction, Inductive Logic Programming, Ontologies
Citation: James Stuart Aitken: Learning Information Extraction Rules: An Inductive Logic Programming approach. In F. van Harmelen (ed.): ECAI2002, Proceedings of the 15th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2002, pp. 355-359.