Authenticated Queries for Cloud-Assisted Multi-Source Data Collection

Project: Research project

Project Details


The convergence of sensor technologies, crowdsourcing, and cloud computing has given rise to a brand-new paradigm for cloud-assisted multi-source data collection, where data is collected by physically distributed collectors and outsourced to the cloud (such as Amazon EC2 and Google App Engine) to provide query services. This paradigm has numerous potential applications across a wide variety of domains such as marketing, transport, healthcare, and environmental monitoring. For example, designated price collectors can collect price data for daily commodity items such as milk powder in various regions and shops and report it to a cloud server; customers can then query the price data through the cloud server (for example, to find the lowest prices for a given item). While this model is appealing in terms of cost, performance, and flexibility, it raises the issue of query integrity. If the cloud server is untrustworthy or compromised, it may return incorrect or incomplete query results to clients and possibly mislead them, leading to wrong decisions being made. The consequences could be disastrous for many critical applications. Therefore, empowering clients to authenticate query results is imperative in such cloud-assisted multi-source data collection and query services.

To guarantee the integrity of query results, query authentication techniques for outsourced databases and distributed networks have been extensively studied. However, existing studies (including our own previous work) are limited to a single data source/owner and/or simple in-network aggregations. Little work has been done on query authentication involving multiple data sources/owners and supporting general queries such as range search and k-nearest-neighbor (kNN) search, as required for cloud-assisted multi-source data collection and query services. In this project, we propose to conduct a thorough investigation of this new multi-source query authentication problem. The research challenges are threefold. Firstly, as the authentication data structures (or signatures) would be generated by the data sources in a distributed manner, a fundamental issue is how to guarantee the completeness of query results. Secondly, in order to minimize result verification overheads, authentication data structures should be generic and aggregatable in the cloud so as to efficiently support various types of queries. Thirdly, some data-sensitive applications may require personal data privacy to be protected against the cloud or clients, entailing further privacy-preserving query authentication techniques in multi-source data collection environments.
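
To make the completeness issue concrete, the following minimal sketch shows one classic single-source idea, signature chaining over sorted values, applied to a range query. It is an illustration only and not the method proposed in this project. All function names (sign_pair, publish, answer_range, verify_range) are hypothetical, and an HMAC with a shared key stands in for a real public-key signature, which is a simplifying assumption.

```python
import hashlib
import hmac

NEG_INF, POS_INF = float("-inf"), float("inf")


def sign_pair(key: bytes, left: float, right: float) -> bytes:
    """Bind two adjacent values together.

    Assumption: HMAC is a stand-in for a digital signature so the sketch
    needs only the standard library.
    """
    msg = f"{left!r}|{right!r}".encode()
    return hmac.new(key, msg, hashlib.sha256).digest()


def publish(key: bytes, values):
    """Data source: sort values, add sentinels, sign every adjacent pair."""
    chain = [NEG_INF] + sorted(set(values)) + [POS_INF]
    sigs = [sign_pair(key, a, b) for a, b in zip(chain, chain[1:])]
    return chain, sigs


def answer_range(chain, sigs, lo, hi):
    """Cloud server: return the results plus one boundary value per side."""
    start = max(i for i, v in enumerate(chain) if v < lo)
    end = min(i for i, v in enumerate(chain) if v > hi)
    return chain[start:end + 1], sigs[start:end]


def verify_range(key: bytes, vo_values, vo_sigs, lo, hi):
    """Client: return the verified results, or None if any check fails."""
    if len(vo_values) < 2 or len(vo_sigs) != len(vo_values) - 1:
        return None
    # The boundary values must fall strictly outside the queried range.
    if not (vo_values[0] < lo and vo_values[-1] > hi):
        return None
    # Every adjacent pair must carry a valid signature, so nothing
    # between the two boundaries can have been silently dropped.
    for a, b, sig in zip(vo_values, vo_values[1:], vo_sigs):
        if not hmac.compare_digest(sig, sign_pair(key, a, b)):
            return None
    inner = vo_values[1:-1]
    if any(not (lo <= v <= hi) for v in inner):
        return None
    return inner
```

With a single source, one party signs every adjacency, so an omitted result breaks the chain. When several independent sources each sign only their own data, no single party can attest to adjacency across sources, which is one way to see why completeness becomes harder in the multi-source setting.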

To address these challenges, our research agenda includes: 1) the design of generic, aggregatable authentication data structures that exploit the common properties of the underlying data; 2) the development of query processing and data updating algorithms for efficient query authentication on both one- and multi-dimensional data; 3) the exploration of multi-source query authentication techniques with privacy-preserving requirements; and 4) security analysis and performance evaluation of the proposed data structures and query authentication algorithms using both theoretical and empirical studies. Finally, we will develop a proof-of-concept prototype system to demonstrate the success of our proposals through practical application. With our extensive research experience in query authentication and privacy-preserving query processing, we expect the outcomes of this project to benefit both users and service providers in the cloud/crowd computing industry.
Effective start/end date: 1/01/15 to 31/12/17

