Distributed Publish/Subscribe Query Processing on the Spatio-Textual Data Stream

Zhida Chen, Gao Cong, Zhenjie Zhang, Tom Z.J. Fu, Lisi Chen

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

53 Citations (Scopus)

Abstract

A huge amount of data with both spatial and textual information, e.g., geo-tagged tweets, is flooding the Internet. Such a spatio-textual data stream contains valuable information for millions of users with various interests in different keywords and locations. Publish/subscribe systems enable efficient and effective information distribution by allowing users to register continuous queries with both spatial and textual constraints. However, the explosive growth of data scale and user base has posed challenges to the existing centralized publish/subscribe systems for spatio-textual data streams. In this paper, we propose a distributed publish/subscribe system, called PS2Stream, which digests a massive spatio-textual data stream and directs the stream to target users with registered interests. Compared with existing systems, PS2Stream achieves a better workload distribution in terms of both minimizing the total amount of workload and balancing the load of workers. To achieve this, we propose a new workload distribution algorithm that considers both the spatial and textual properties of the data. Additionally, PS2Stream supports dynamic load adjustments to adapt to changes in the workload. An extensive empirical evaluation on a commercial cloud computing platform with real data validates the superiority of our system design and the advantages of our techniques in improving system performance.

Original language: English
Title of host publication: Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017
Publisher: IEEE Computer Society
Pages: 1095-1106
Number of pages: 12
ISBN (Electronic): 9781509065431
ISBN (Print): 9781509065448
Publication status: Published - 19 Apr 2017
Event: 33rd IEEE International Conference on Data Engineering, ICDE 2017 - San Diego, United States
Duration: 19 Apr 2017 to 22 Apr 2017

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Electronic): 2375-026X

Conference

Conference: 33rd IEEE International Conference on Data Engineering, ICDE 2017
Country/Territory: United States
City: San Diego
Period: 19/04/17 to 22/04/17
