Abstract
Information extraction systems based on rule-based pattern matching have long been used to locate information embedded in documents. To improve extraction generalization, the hidden Markov model (HMM) has recently been adopted for modeling temporal variations of the target patterns, with promising results. In this paper, a state-merging method is adopted for learning the HMM topology, using a localized Kullback-Leibler (KL) divergence. The proposed system has been applied to a set of domain-specific job advertisements, and preliminary experiments show promising results.
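The abstract does not define the "localized" KL divergence or the merging schedule, so the following is only a minimal sketch of the general idea of KL-guided state merging, not the paper's method. It assumes each HMM state has a discrete emission distribution over a shared symbol set, uses a plain symmetric KL divergence as the similarity measure, and merges the closest pair by count-weighted averaging. The function names, the `eps` smoothing constant, and the toy numbers are all illustrative assumptions.

```python
import math
from itertools import combinations


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same symbols.

    eps is an assumed smoothing constant that avoids log(0).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as a state-similarity score."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def closest_state_pair(emissions):
    """Return the pair of state names with the most similar emissions."""
    return min(combinations(sorted(emissions), 2),
               key=lambda pair: symmetric_kl(emissions[pair[0]],
                                             emissions[pair[1]]))


def merge_emissions(emissions, counts, a, b):
    """Count-weighted average of the emission distributions of a and b."""
    total = counts[a] + counts[b]
    return [(counts[a] * pa + counts[b] * pb) / total
            for pa, pb in zip(emissions[a], emissions[b])]


# Toy example (hypothetical numbers): three states over three symbols.
emissions = {
    "s1": [0.7, 0.2, 0.1],
    "s2": [0.6, 0.3, 0.1],
    "s3": [0.1, 0.1, 0.8],
}
counts = {"s1": 30, "s2": 10, "s3": 20}

pair = closest_state_pair(emissions)
merged = merge_emissions(emissions, counts, *pair)
```

In a full topology learner, a step like this would be repeated until no pair of states falls below a divergence threshold, and transition probabilities would be merged as well; both are omitted here.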
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining - 8th Pacific-Asia Conference, PAKDD 2004, Proceedings |
Editors | Honghua Dai, Ramakrishnan Srikant, Chengqi Zhang |
Publisher | Springer Verlag |
Pages | 590-594 |
Number of pages | 5 |
ISBN (Print) | 354022064X, 9783540220640 |
Publication status | Published - 2004 |
Event | 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2004 - Sydney, Australia. Duration: 26 May 2004 → 28 May 2004. https://link.springer.com/book/10.1007/b97861 (Conference proceedings) |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 3056 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2004 |
---|---|
Abbreviated title | PAKDD 2004 |
Country/Territory | Australia |
City | Sydney |
Period | 26/05/04 → 28/05/04 |
Scopus Subject Areas
- Theoretical Computer Science
- General Computer Science