15th European Conference on Artificial Intelligence
July 21-26 2002, Lyon, France

ECAI-2002 Conference Paper

Learning Information Extraction Rules: An Inductive Logic Programming approach

James Stuart Aitken

The objective of this work is to learn information extraction rules by applying Inductive Logic Programming (ILP) techniques to natural language data. The approach is ontology-based, which means that the extraction rules conclude with specific ontology relations that characterise the meaning of sentences in the text. An existing ILP system, FOIL, is used to learn attribute-value relations, which enables instances of these relations to be identified in the text. Specifically, we explore the linguistic preprocessing of the data, the use of background knowledge in the learning process, and the practical considerations of applying a supervised learning approach to rule induction, i.e. the human effort involved in creating the data set and the biases inherent in the use of small data sets.

Keywords: Information Extraction, Inductive Logic Programming, Ontologies

Citation: James Stuart Aitken: Learning Information Extraction Rules: An Inductive Logic Programming approach. In F. van Harmelen (ed.): ECAI2002, Proceedings of the 15th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2002, pp.355-359.

ECAI-2002 is organised by the European Coordinating Committee for Artificial Intelligence (ECCAI) and hosted by the Université Claude Bernard and INSA, Lyon, on behalf of Association Française pour l'Intelligence Artificielle.