Linear Suffix Array Construction by Almost Pure Induced-Sorting

Ge Nong, Sen Zhang, Wai Hong Chan

Research output: Chapter in book/report/conference proceedingConference proceedingpeer-review

62 Citations (Scopus)

Abstract

We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm.The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm)used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps.The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code.The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.
Original languageEnglish
Title of host publication2009 Data Compression Conference
EditorsJames A. Storer, Michael W. Marcellin
Place of PublicationLos Alamitos, CA
PublisherIEEE Computer Society
Pages193-202
Number of pages10
ISBN (Print)9780769535920, 9781424437535
DOIs
Publication statusPublished - 16 Mar 2009
Externally publishedYes
Event2009 Data Compression Conference - Snowbird, UT, USA
Duration: 16 Mar 200918 Mar 2009

Publication series

NameData Compression Conference (DCC)
PublisherIEEE Computer Society
ISSN (Print)1068-0314

Conference

Conference2009 Data Compression Conference
Period16/03/0918/03/09

User-Defined Keywords

  • Sorting
  • Educational institutions
  • Algorithm design and analysis
  • Data compression
  • Computer science
  • Sun
  • Mathematics
  • Data structures
  • Indexing
  • Information retrieval

Fingerprint

Dive into the research topics of 'Linear Suffix Array Construction by Almost Pure Induced-Sorting'. Together they form a unique fingerprint.

Cite this