Dynamic-MLCS: Fast searching for dynamic multiple longest common subsequences in sequence stream data

Yuanyuan Fu, Chunyang Wang, Jixin Zhu, Qun Zhang, Yiuming Cheung, Yuping Wang*

*Corresponding author for this work

Research output: Contribution to journalJournal articlepeer-review

Abstract

In recent years, data stream mining has become a popular research topic, wherein the dynamic multiple longest common subsequence problem for sequence streams (i.e., D-MLCS) is an important issue with a vast range of applications. However, the existing methods are only for static multiple longest common subsequence problem (i.e., S-MLCS) and there are no algorithms for the D-MLCS problem. Using the existing algorithms for the S-MLCS problem to solve the D-MLCS problem often leads to an overload of the main memory and is extremely time consuming as all the sequence segments should be stored into the main memory when the new segments flow in, and all updated problems should be resolved. To overcome this shortcoming, we designed a scheme for constructing dynamic Directed Acyclic Graphs (DAGs). It only needs to update the current DAG post each sequence segment flows in, without constructing the DAG from the beginning. We propose a novel algorithm for the D-MLCS problem called Dynamic-MLCS, which only needs to store a part of the key information of each streaming sequence segments, instead of storing all streaming sequence segments and scanning them repeatedly. Theoretical analyses demonstrate that the time and space complexities of the proposed algorithm are only relevant and linear to the number of the incomplete points and newly generated nodes, and the experiments conducted on the datasets of sequence streams indicate that the proposed Dynamic-MLCS is efficient and performs better than the compared state-of-the-art algorithms.

Original languageEnglish
Article number111654
JournalKnowledge-Based Systems
Volume293
DOIs
Publication statusPublished - 7 Jun 2024

Scopus Subject Areas

  • Software
  • Management Information Systems
  • Information Systems and Management
  • Artificial Intelligence

User-Defined Keywords

  • Complete point
  • Directed Acyclic Graphs (DAG)
  • Incomplete point
  • Multiple Longest Common Subsequences (MLCS)
  • Sequence stream

Fingerprint

Dive into the research topics of 'Dynamic-MLCS: Fast searching for dynamic multiple longest common subsequences in sequence stream data'. Together they form a unique fingerprint.

Cite this