Toward Secure and Dependable Storage Services in Cloud Computing
Abstract
Cloud storage enables users to store their data remotely and enjoy on-demand, high-quality cloud applications without the burden of local hardware and software management. Though the benefits are clear, such a service also relinquishes users’ physical possession of their outsourced data, which inevitably poses new security risks to the correctness of the data in the cloud. To address this problem and achieve a secure and dependable cloud storage service, we propose in this paper a flexible distributed storage integrity auditing mechanism that uses homomorphic tokens and distributed erasure-coded data. The proposed design allows users to audit the cloud storage with very lightweight communication and computation cost.
INTRODUCTION
Several trends are opening up the era of cloud computing, which is an Internet-based development and use of computer technology. Ever cheaper and more powerful processors, together with the Software as a Service (SaaS) computing architecture, are transforming data centers into pools of computing service on a huge scale. Increasing network bandwidth and reliable yet flexible network connections make it possible for users to subscribe to high-quality services from data and software that reside solely on remote data centers.
PROBLEM STATEMENT
System Model
A representative network architecture for cloud storage service is illustrated in Fig. 1. Three different network entities can be identified as follows:
User: an entity that has data to be stored in the cloud and relies on the cloud for data storage and computation; it can be either an enterprise or an individual customer.
Cloud Server (CS): an entity that is managed by a cloud service provider (CSP) to provide data storage service and that has significant storage space and computation resources (we will not differentiate CS and CSP hereafter).
Third-Party Auditor (TPA): an optional entity that has expertise and capabilities that users may not have and is trusted to assess and expose the risk of cloud storage services on behalf of the users upon request.
ENSURING CLOUD DATA STORAGE
In a cloud data storage system, users store their data in the cloud and no longer possess the data locally. Thus, the correctness and availability of the data files stored on the distributed cloud servers must be guaranteed. One of the key issues is to effectively detect any unauthorized data modification and corruption, possibly due to server compromise and/or random Byzantine failures. Besides, in the distributed case, when such inconsistencies are successfully detected, finding which server the data error lies in is also of great significance, since it can always be the first step toward fast recovery of the storage errors and/or identification of potential threats of external attacks.
To address these problems, our main scheme for ensuring cloud data storage is presented in this section. The first part of the section is devoted to a review of the basic tools from coding theory that are needed in our scheme for file distribution across cloud servers. Subsequently, it is shown how to derive a challenge-response protocol for verifying the storage correctness as well as identifying misbehaving servers. The procedure for file retrieval and error recovery based on an erasure-correcting code is also outlined.
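As a rough illustration of the challenge-response idea outlined above, the following minimal sketch lets a user precompute one short token per server and per challenge round, then compare each server's later response against the stored token; a mismatch points to the misbehaving server. The token construction here (a linear combination of sampled blocks over a prime field) and all names (`P`, `challenge_indices`, `response`, `precompute_tokens`, `find_misbehaving`) are assumptions made for this example, not the construction defined in this paper. Python's `random.Random` stands in for a keyed pseudorandom function and is not cryptographically secure.

```python
import random

P = 2**31 - 1  # assumed prime modulus; blocks are integers in [0, P)


def challenge_indices(seed, num_blocks, r):
    """Derive r distinct block indices from a seed (stand-in for a keyed PRF)."""
    return random.Random(seed).sample(range(num_blocks), r)


def response(blocks, seed, alpha, r):
    """Linear combination of the challenged blocks with powers of alpha, mod P."""
    idx = challenge_indices(seed, len(blocks), r)
    return sum(pow(alpha, q + 1, P) * blocks[i] for q, i in enumerate(idx)) % P


def precompute_tokens(blocks, challenges, r):
    """User side, before outsourcing: one token per (seed, alpha) challenge."""
    return [response(blocks, seed, alpha, r) for seed, alpha in challenges]


def find_misbehaving(servers, tokens, round_no, challenges, r):
    """Return indices of servers whose response differs from the stored token."""
    seed, alpha = challenges[round_no]
    return [
        j for j, blocks in enumerate(servers)
        if response(blocks, seed, alpha, r) != tokens[j][round_no]
    ]


# Usage: two servers, four blocks each, every block challenged (r = 4).
servers = [[1, 2, 3, 4], [5, 6, 7, 8]]
challenges = [(11, 7), (12, 9)]  # hypothetical (seed, alpha) pairs
tokens = [precompute_tokens(b, challenges, 4) for b in servers]
servers[1][2] += 1  # simulate corruption on server 1
print(find_misbehaving(servers, tokens, 0, challenges, 4))
```

In this sketch the user keeps only the tokens and the challenge parameters, so verification needs neither the original data nor a full download. When r is smaller than the number of blocks, a corrupted block is caught only if it happens to be sampled.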
File Distribution Preparation
It is well known that erasure-correcting codes may be used to tolerate multiple failures in distributed storage systems. In cloud data storage, we rely on this technique to disperse the data file F redundantly across a set of n = m + k distributed servers. By placing each of the m + k vectors (m data vectors and k parity vectors) on a different server, the original data file can survive the failure of any k of the m + k servers without any data loss, with a space overhead of k/m. To support efficient sequential I/O to the original file, our file layout is systematic, i.e., the unmodified m data file vectors together with the k parity vectors are distributed across m + k different servers.
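The systematic layout described above can be sketched with a small toy code. The example below is an assumption-laden illustration, not the code used in this paper: it treats each column of the m data vectors as values of a polynomial of degree less than m over a small prime field and derives the k parity vectors by evaluating that polynomial at k further points, so any m of the m + k vectors suffice for recovery. The names `P`, `lagrange_eval`, `encode`, and `recover` are hypothetical.

```python
# Toy systematic (m, k) erasure code over the prime field GF(P).
# Assumption: Lagrange interpolation stands in for the paper's actual code.
P = 257  # small prime; every symbol is an integer in [0, P), and m + k < P


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data_vectors, k):
    """Return the m data vectors unchanged, followed by k parity vectors."""
    m = len(data_vectors)
    length = len(data_vectors[0])
    parity = [
        [lagrange_eval([(i + 1, data_vectors[i][c]) for i in range(m)], x)
         for c in range(length)]
        for x in range(m + 1, m + k + 1)
    ]
    return [list(v) for v in data_vectors] + parity


def recover(surviving, m):
    """Rebuild the m data vectors from any m (server_index, vector) pairs."""
    chosen = surviving[:m]
    length = len(chosen[0][1])
    return [
        [lagrange_eval([(idx + 1, vec[c]) for idx, vec in chosen], x)
         for c in range(length)]
        for x in range(1, m + 1)
    ]


# Usage: m = 3 data vectors, k = 2 parity vectors, n = 5 servers.
data = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
stored = encode(data, k=2)
# Servers 0 and 2 fail; servers 1, 3, and 4 survive.
survivors = [(1, stored[1]), (3, stored[3]), (4, stored[4])]
assert recover(survivors, m=3) == data
```

Because the first m stored vectors are the data vectors themselves, a reader who trusts the first m servers can read the file sequentially without any decoding. In this example the space overhead is k/m = 2/3.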
File Retrieval and Error Recovery
Since our layout of the file matrix is systematic, the user can reconstruct the original file by downloading the data vectors from the first m servers, assuming that they return the correct response values. Notice that our verification scheme is based on random spot-checking, so the storage correctness assurance is a probabilistic one. However, by choosing system parameters (e.g., r, l, t) appropriately and conducting enough verifications, we can guarantee successful file retrieval with high probability. On the other hand, whenever the data corruption is detected.
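To make the probabilistic nature of spot-checking concrete, the sketch below computes how likely a corruption is to be caught under a simple model. The interpretation of the symbols is an assumption for this example only: l is the number of blocks in a stored vector, r is the number of distinct blocks sampled uniformly per verification, t is the number of independent verifications, and z (a name introduced here) is the number of corrupted blocks. The model also assumes that a verification always fails when it samples at least one corrupted block. Under these assumptions the detection probability is 1 - [C(l - z, r) / C(l, r)]^t.

```python
from math import comb


def miss_probability(l, z, r):
    """Probability that one verification samples none of the z corrupted
    blocks when r distinct blocks are drawn uniformly from l blocks."""
    return comb(l - z, r) / comb(l, r)


def detection_probability(l, z, r, t=1):
    """Probability that at least one of t independent verifications
    samples a corrupted block."""
    return 1 - miss_probability(l, z, r) ** t


# Usage: 1,000 blocks, 10 of them corrupted, 50 sampled per verification.
for t in (1, 5, 10):
    print(t, detection_probability(l=1000, z=10, r=50, t=t))
```

The probability grows with both r and t, which is why the two parameters can be traded against each other: fewer blocks per verification can be compensated by verifying more often.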
PROVIDING DYNAMIC DATA OPERATION SUPPORT
So far, we have assumed that F represents static or archived data. This model may fit some application scenarios, such as libraries and scientific data sets. However, in cloud data storage, there are many potential scenarios where the data stored in the cloud is dynamic, such as electronic documents, photos, or log files. Therefore, it is crucial to consider the dynamic case, where a user may wish to perform various block-level operations (update, delete, and append) to modify the data file while maintaining the storage correctness assurance.