Harvesting relations from the web - Quantifying the impact of filtering functions

Author(s): Blohm, S.
Cimiano, P.
Stemle, E.
Keywords: Data structures; Iterative methods; Pattern recognition; Statistical methods; World Wide Web; Datasets; Filtering functions; Tuple evaluation measures; Learning algorithms
Publication date: 2007
Journal: Proceedings of the National Conference on Artificial Intelligence
Volume: 2
Start page: 1316
End page: 1321
Abstract:
Several bootstrapping-based relation extraction algorithms working on large corpora or on the Web have been presented in the literature. A crucial issue for such algorithms is to avoid the introduction of too much noise into further iterations. Typically, this is achieved by applying appropriate pattern and tuple evaluation measures, henceforth called filtering functions, thereby selecting only the most promising patterns and tuples. In this paper, we systematically compare different filtering functions proposed across the literature. Although we also discuss our own implementation of a pattern learning algorithm, the main contribution of the paper is actually the extensive comparison and evaluation of the different filtering functions proposed in the literature with respect to seven datasets. Our results indicate that some of the commonly used filters do not outperform a trivial baseline filter in a statistically significant manner. Copyright © 2007, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Description:
Conference of AAAI-07/IAAI-07 Proceedings: 22nd AAAI Conference on Artificial Intelligence and the 19th Innovative Applications of Artificial Intelligence Conference; Conference Date: 22 July 2007 through 26 July 2007; Conference Code: 70600
ISBN: 9781577353232
External URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-36348944595&partnerID=40&md5=64c81800cbe9ea2b9cde63d1cfa58274
