Abstract
In recent years, data stream mining has become a popular research topic, wherein the dynamic multiple longest common subsequence problem for sequence streams (i.e., D-MLCS) is an important issue with a vast range of applications. However, the existing methods are only for static multiple longest common subsequence problem (i.e., S-MLCS) and there are no algorithms for the D-MLCS problem. Using the existing algorithms for the S-MLCS problem to solve the D-MLCS problem often leads to an overload of the main memory and is extremely time consuming as all the sequence segments should be stored into the main memory when the new segments flow in, and all updated problems should be resolved. To overcome this shortcoming, we designed a scheme for constructing dynamic Directed Acyclic Graphs (DAGs). It only needs to update the current DAG post each sequence segment flows in, without constructing the DAG from the beginning. We propose a novel algorithm for the D-MLCS problem called Dynamic-MLCS, which only needs to store a part of the key information of each streaming sequence segments, instead of storing all streaming sequence segments and scanning them repeatedly. Theoretical analyses demonstrate that the time and space complexities of the proposed algorithm are only relevant and linear to the number of the incomplete points and newly generated nodes, and the experiments conducted on the datasets of sequence streams indicate that the proposed Dynamic-MLCS is efficient and performs better than the compared state-of-the-art algorithms.
Original language | English |
---|---|
Article number | 111654 |
Journal | Knowledge-Based Systems |
Volume | 293 |
DOIs | |
Publication status | Published - 7 Jun 2024 |
Scopus Subject Areas
- Software
- Management Information Systems
- Information Systems and Management
- Artificial Intelligence
User-Defined Keywords
- Complete point
- Directed Acyclic Graphs (DAG)
- Incomplete point
- Multiple Longest Common Subsequences (MLCS)
- Sequence stream