Chinese document layout analysis based on adaptive split-and-merge and qualitative spatial reasoning

Jiming LIU*, Yuan Y. Tang, Ching Y. Suen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

The ultimate goal of automatic document processing is to understand the semantics of a document. Toward that end, a primary enabling step is to reason about the layout of the document by means of page segmentation and spatial reasoning about, or labeling of, the resulting segments. This, in turn, allows the logical organization of the document to be derived. This paper describes a generic document segmentation and geometric relation labeling method with applications to Chinese document analysis. Unlike previous document segmentation methods, which rely on text spacing, border lines, and/or a priori layout models based on template matching, the present method begins with a hierarchy of partitioned image layers in which inhomogeneous higher-level regions are recursively partitioned into lower-level rectangular subregions while, at the same time, smaller homogeneous lower-level regions are merged into larger homogeneous regions. Furthermore, the derived segment data structure readily enables efficient search for geometric relationships between the identified document segments.
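
The split-and-merge idea and the geometric relation labeling summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' adaptive algorithm: the homogeneity test (an ink-ratio threshold `tol`), the halving rule for splits, the rectangle encoding, and the relation vocabulary (`above`, `below`, `left-of`, `right-of`, `overlaps`) are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed encoding: (top, left, bottom, right); bottom and right are exclusive.
Rect = Tuple[int, int, int, int]
Image = Sequence[Sequence[int]]  # binary page: 1 = ink, 0 = background


def ink_ratio(img: Image, r: Rect) -> float:
    """Fraction of ink pixels inside rectangle r."""
    top, left, bottom, right = r
    total = sum(img[y][x] for y in range(top, bottom) for x in range(left, right))
    return total / ((bottom - top) * (right - left))


def is_homogeneous(img: Image, r: Rect, tol: float = 0.05) -> bool:
    """Assumed criterion: a region is homogeneous if nearly all ink or nearly all blank."""
    ratio = ink_ratio(img, r)
    return ratio <= tol or ratio >= 1.0 - tol


def split(img: Image, r: Rect) -> List[Rect]:
    """Recursively halve inhomogeneous regions along their longer side."""
    top, left, bottom, right = r
    h, w = bottom - top, right - left
    if is_homogeneous(img, r) or (h <= 1 and w <= 1):
        return [r]
    if h >= w:
        mid = top + h // 2
        halves = [(top, left, mid, right), (mid, left, bottom, right)]
    else:
        mid = left + w // 2
        halves = [(top, left, bottom, mid), (top, mid, bottom, right)]
    return [leaf for half in halves for leaf in split(img, half)]


def try_union(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the union of a and b if they are adjacent and the union is a rectangle."""
    if a[1] == b[1] and a[3] == b[3] and (a[2] == b[0] or b[2] == a[0]):
        return (min(a[0], b[0]), a[1], max(a[2], b[2]), a[3])
    if a[0] == b[0] and a[2] == b[2] and (a[3] == b[1] or b[3] == a[1]):
        return (a[0], min(a[1], b[1]), a[2], max(a[3], b[3]))
    return None


def merge(img: Image, rects: List[Rect]) -> List[Rect]:
    """Repeatedly merge adjacent regions of the same kind (ink or background)."""
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                u = try_union(rects[i], rects[j])
                same_kind = (ink_ratio(img, rects[i]) >= 0.5) == (ink_ratio(img, rects[j]) >= 0.5)
                if u is not None and same_kind:
                    rects[j] = u
                    del rects[i]
                    changed = True
                    break
            if changed:
                break
    return rects


def segment(img: Image) -> List[Rect]:
    """Split the whole page, then merge the resulting leaves."""
    root = (0, 0, len(img), len(img[0]))
    return merge(img, split(img, root))


def relation(a: Rect, b: Rect) -> str:
    """Qualitative position of a relative to b (assumed, simplified vocabulary)."""
    if a[2] <= b[0]:
        return "above"
    if b[2] <= a[0]:
        return "below"
    if a[3] <= b[1]:
        return "left-of"
    if b[3] <= a[1]:
        return "right-of"
    return "overlaps"


# Toy page: one line of ink across the top, blank below.
page = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
segments = sorted(segment(page))   # [(0, 0, 1, 4), (1, 0, 4, 4)]
print(relation(segments[0], segments[1]))  # above
```

On the toy page, splitting first cuts the ink line into two pieces and the blank area into three; merging then recombines them into one ink segment and one background segment, after which their relation can be read directly from the rectangle coordinates.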

Original language: English
Pages (from-to): 1265-1278
Number of pages: 14
Journal: Pattern Recognition
Volume: 30
Issue number: 8
Publication status: Published - Aug 1997

Scopus Subject Areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

User-Defined Keywords

  • Adaptive split-and-merge
  • Chinese document processing
  • Geometric structure
  • Segment spatial reasoning
