TY - GEN
T1 - Learning hidden Markov model topology based on KL divergence for information extraction
AU - Au, Kwok Chung
AU - Cheung, Kwok Wai
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2004.
PY - 2004
Y1 - 2004
N2 - To locate information embedded in documents, information extraction systems based on rule-based pattern matching have long been used. To further improve extraction generalization, hidden Markov models (HMMs) have recently been adopted for modeling temporal variations of the target patterns, with promising results. In this paper, a state-merging method is adopted for learning the HMM topology using a localized Kullback-Leibler (KL) divergence. The proposed system has been applied to a set of domain-specific job advertisements, and preliminary experiments show promising results.
AB - To locate information embedded in documents, information extraction systems based on rule-based pattern matching have long been used. To further improve extraction generalization, hidden Markov models (HMMs) have recently been adopted for modeling temporal variations of the target patterns, with promising results. In this paper, a state-merging method is adopted for learning the HMM topology using a localized Kullback-Leibler (KL) divergence. The proposed system has been applied to a set of domain-specific job advertisements, and preliminary experiments show promising results.
UR - http://www.scopus.com/inward/record.url?scp=7444239829&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-24775-3_70
DO - 10.1007/978-3-540-24775-3_70
M3 - Conference proceeding
AN - SCOPUS:7444239829
SN - 354022064X
SN - 9783540220640
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 590
EP - 594
BT - Advances in Knowledge Discovery and Data Mining - 8th Pacific-Asia Conference, PAKDD 2004, Proceedings
A2 - Dai, Honghua
A2 - Srikant, Ramakrishnan
A2 - Zhang, Chengqi
PB - Springer Verlag
T2 - 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2004
Y2 - 26 May 2004 through 28 May 2004
ER -