Structure-sensitive learning of text types

Author(s): Geibel, P.
Krumnack, U.
Pustylnikov, O.
Mehler, A.
Gust, H.
Kühnberger, K.-U.
Keywords: Classification (of information); Learning systems; Trees (mathematics); Logical document structure; Structural features; Text processing
Publication date: 2007
Publisher: Springer Verlag
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4830 LNAI
Start page: 642
End page: 646
Abstract:
In this paper, we discuss the structure-based classification of documents using their logical document structure, i.e., their DOM trees. We describe a method that uses predefined structural features, as well as four tree kernels suitable for such structures. We evaluate the methods experimentally on a corpus containing the DOM trees of newspaper articles and on the well-known SUSANNE corpus. We demonstrate that, for the two corpora, many text types can be learned from structural features alone. © Springer-Verlag Berlin Heidelberg 2007.
Description:
Conference: 20th Australian Joint Conference on Artificial Intelligence, AI 2007; Conference date: 2 December 2007 through 6 December 2007; Conference code: 71206
ISBN: 9783540769262
ISSN: 03029743
DOI: 10.1007/978-3-540-76928-6_68
External URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-38349068707&doi=10.1007%2f978-3-540-76928-6_68&partnerID=40&md5=d0cda235df49be1177039638371d3d05
