03-11-2016, 10:19 AM
Enabling Efficient Multi-Keyword Ranked Search Over Encrypted Mobile Cloud
Data Through Blind Storage
1463730634-multikeyword.pdf (Size: 4.3 MB / Downloads: 9)
ABSTRACT In mobile cloud computing, a fundamental application is to outsource the mobile data to
external cloud servers for scalable data storage. The outsourced data, however, need to be encrypted due
to the privacy and confidentiality concerns of their owner. This results in the distinguished difficulties on
the accurate search over the encrypted mobile cloud data. To tackle this issue, in this paper, we develop the
searchable encryption for multi-keyword ranked search over the storage data. Specifically, by considering
the large number of outsourced documents (data) in the cloud, we utilize the relevance score and k-nearest
neighbor techniques to develop an efficient multi-keyword search scheme that can return the ranked search
results based on the accuracy. Within this framework, we leverage an efficient index to further improve the
search efficiency, and adopt the blind storage system to conceal access pattern of the search user. Security
analysis demonstrates that our scheme can achieve confidentiality of documents and index, trapdoor privacy,
trapdoor unlinkability, and concealing access pattern of the search user. Finally, using extensive simulations,
we show that our proposal can achieve much improved efficiency in terms of search functionality and search
time compared with the existing proposals.
INTRODUCTION
Mobile cloud computing [1]–[4] gets rid of the
hardware limitation of mobile devices by exploring the
scalable and virtualized cloud storage and computing
resources, and accordingly is able to provide much more
powerful and scalable mobile services to users. In mobile
cloud computing, mobile users typically outsource their data
to external cloud servers, e.g., iCloud, to enjoy a stable,
low-cost and scalable way for data storage and access.
However, as outsourced data typically contain sensitive
privacy information, such as personal photos, emails,
etc., which would lead to severe confidentiality and privacy
violations [5], if without efficient protections. It is therefore necessary to encrypt the sensitive data before outsourcing
them to the cloud. The data encryption, however, would
result in salient difficulties when other users need to access
interested data with search, due to the difficulties of search
over encrypted data. This fundamental issue in mobile
cloud computing accordingly motivates an extensive body of
research in the recent years on the investigation of searchable
encryption technique to achieve efficient searching over
outsourced encrypted data [6]–[9].
A collection of research works have recently been developed
on the topic of multi-keyword search over encrypted
data. Cash et al. [10] propose a symmetric searchable
encryption scheme which achieves high efficiency for large
databases with modest scarification on security guarantees.
Cao et al. [11] propose a multi-keyword search scheme supporting
result ranking by adopting k-nearest neighbors (kNN)
technique [12]. Naveed et.al. [13] propose a dynamic searchable
encryption scheme through blind storage to conceal
access pattern of the search user.
In order to meet the practical search requirements, search
over encrypted data should support the following three functions.
First, the searchable encryption schemes should support
multi-keyword search, and provide the same user experience
as searching in Google search with different keywords;
single-keyword search is far from satisfactory by only returning
very limited and inaccurate search results. Second, to
quickly identify most relevant results, the search user would
typically prefer cloud servers to sort the returned search
results in a relevance-based order [14] ranked by the relevance
of the search request to the documents. In addition, showing
the ranked search to users can also eliminate the unnecessary
network traffic by only sending back the most relevant results
from cloud to search users. Third, as for the search efficiency,
since the number of the documents contained in a database
could be extraordinarily large, searchable encryption schemes
should be efficient to quickly respond to the search requests
with minimum delays.
In contrast to the theoretical benefits, most of the existing
proposals, however, fail to offer sufficient insights towards
the construction of full functioned searchable encryption
as described above. As an effort towards the issue, in
this paper, we propose an efficient multi-keyword ranked
search (EMRS) scheme over encrypted mobile cloud data
through blind storage. Our main contributions can be summarized
as follows:
• We introduce a relevance score in searchable encryption
to achieve multi-keyword ranked search over the
encrypted mobile cloud data. In addition to that, we construct
an efficient index to improve the search efficiency.
• By modifying the blind storage system in the EMRS,
we solve the trapdoor unlinkability problem and conceal
access pattern of the search user from the cloud server.
• We give thorough security analysis to demonstrate that
the EMRS can reach a high security level including con-
fidentiality of documents and index, trapdoor privacy,
trapdoor unlinkability, and concealing access pattern
of the search user. Moreover, we implement extensive
experiments, which show that the EMRS can achieve
enhanced efficiency in the terms of functionality and
search efficiency compared with existing proposals.
The remainder of this paper is organized as follows.
In Section II, the system model, security requirements and
design goal are formalized. In Section III, we recap relevance
scoring, secure kNN technique, blind storage system and
ciphertext policy attribute-based encryption. In Section IV,
we propose the EMRS. Its security analysis and performance
evaluation are presented in Section V and Section VI, respectively.
In Section VII, we present related work. Finally, we
conclude this paper in Section VIII.
SYSTEM MODEL, SECURITY REQUIREMENTS
AND DESIGN GOAL
A. SYSTEM MODEL
As shown in Fig. 1, the system model in the EMRS consists
of three entities: data owner, search users and cloud server.
The data owner keeps a large collection of documents D to be
outsourced to a cloud server in an encrypted form C. In the
system, the data owner sets a keyword dictionary W which
contains d keywords. To enable search users to query over
the encrypted documents, the data owner builds the encrypted
index z. Both the encrypted documents C and encrypted
index z are stored on the cloud server through blind storage
system.
When a search user wants to search over the encrypted
documents, she first receives the secret key from the data
owner. Then, she chooses a conjunctive keyword set $ which
contains l interested keywords and computes a trapdoor T
including a keyword-related token stag and the encrypted
query vector Q. Finally, the search user sends stag, Q, and
an optional number k to the cloud server to request the most
k relevant results.
Upon receiving stag, Q, and k from the search user, the
cloud server uses the stag to access the index z in the blind
storage and computes the relevance scores with the encrypted
query vector Q. Then, the cloud server sends back descriptors
(Dsc) of the top-k documents that are most relevant to the
searched keywords. The search user can use these descriptors
to access the blind storage system to retrieve the encrypted
documents. An access control technique, e.g., attribute-based
encryption, can be implemented to manage the search user’s
decryption capability.
B. SECURITY REQUIREMENTS
In the EMRS, we consider the cloud server to be curious
but honest which means it executes the task assigned by
the data owner and the search user correctly. However, it is
curious about the data in its storage and the received trapdoors
to obtain additional information. Moreover, we consider the
Knowing Background model in the EMRS, which allows
the cloud server to know more background information of
the documents such as statistical information of the keywords.
Specifically, the EMRS aims to provide the following four
security requirements:
• Confidentiality of Documents and Index: Documents
and index should be encrypted before being outsourced
to a cloud server. The cloud server should be prevented
from prying into the outsourced documents and cannot
deduce any associations between the documents and
keywords using the index.
• Trapdoor Privacy: Since the search user would like
to keep her searches from being exposed to the cloud
server, the cloud server should be prevented from knowing
the exact keywords contained in the trapdoor of the
search user.
• Trapdoor Unlinkability: The trapdoors should not be
linkable, which means the trapdoors should be totally
different even if they contain the same keywords.
In other words, the trapdoors should be randomized
rather than determined. The cloud server cannot deduce
any associations between two trapdoors.
• Concealing Access Pattern of the Search User: Access
pattern is the sequence of the searched results. In the
EMRS, the access pattern should be totally concealed
from the cloud server. Specifically, the cloud server
cannot learn the total number of the documents stored
on it nor the size of the searched document even when
the search user retrieves this document from the cloud
server.
C. DESIGN GOAL
To enable efficient and privacy-preserving multi-keyword
ranked search over encrypted mobile cloud data via blind
storage system, the EMRS has following design goals:
• Multi-Keyword Ranked Search: To meet the requirements
for practical uses and provide better user experience,
the EMRS should not only support multi-keyword
search over encrypted mobile cloud data, but also
achieve relevance-based result ranking.
• Search Efficiency: Since the number of the total documents
may be very large in a practical situation,
the EMRS should achieve sublinear search with better
search efficiency.
• Confidentiality and Privacy Preservation: To prevent the
cloud server from learning any additional information
about the documents and the index, and to keep search
users’ trapdoors secret, the EMRS should cover all the
security requirements that we introduced above.
III. PRELIMINARIES
A. RELEVANCE SCORING
In searchable symmetric encryption (SSE) schemes, due to a
large number of documents, search results should be retrieved
in an order of the relevancy with the searched keywords.
Scoring is the natural way to weight the relevancy of the documents.
Among many relevance scoring techniques, we adopt
TF-IDF weighting [15] in the EMRS. In TF-IDF weighting,
term frequency tft, f refers to the number of term t in a document f. Inverse document frequency is calculated as
idft = log N
dft
, where dft denotes the number of documents
which contain term t and N refers to the total number of
documents in the database. Then, the weighting of term t in a
document f can be calculated as tft,f ∗ idft
.
B. SECURE kNN COMPUTATION
We adopt the work of Wong et al. [12] in the EMRS. Wong
et al. propose a secure k-nearest neighbor (kNN) scheme
which can confidentially encrypt two vectors and compute
Euclidean distance of them. First, the secret key (S, M1, M2)
should be generated. The binary vector S is a splitting indicator
to split plaintext vector into two random vectors, which
can confuse the value of plaintext vector. And M1 and M2 are
used to encrypt the split vectors. The correctness and security
of secure kNN computation scheme can be referred to [12].
C. BLIND STORAGE SYSTEM
A blind storage system [13] is built on the cloud server to support
adding, updating and deleting documents and concealing
the access pattern of the search user from the cloud server.
In the blind storage system, all documents are divided into
fixed-size blocks. These blocks are indexed by a sequence
of random integers generated by a document-related seed.
In the view of a cloud server, it can only see the blocks of
encrypted documents uploaded and downloaded. Thus, the
blind storage system leaks little information to the cloud
server. Specifically, the cloud server does not know which
blocks are of the same document, even the total number of
the documents and the size of each document. Moreover, all
the documents and index can be stored in the blind storage
system to achieve a searchable encryption scheme.
D. CIPHERTEXT POLICY ATTRIBUTE-BASED
ENCRYPTION
In ciphertext policy attribute-based encryption
(CP-ABE) [16], ciphertexts are created with an access structure
(usually an access tree) which defines the access policy.
A user can decrypt the data only if the attributes embedded in
his attribute keys satisfy the access policy in the ciphertext.
In CP-ABE, the encrypter holds the ultimate authority of the
access policy.
IV. PROPOSED SCHEME
In this section, we propose the detailed EMRS. Since the
encrypted documents and index z are both stored in the
blind storage system, we would provide the general construction
of the blind storage system. Moreover, since the EMRS
aims to eliminate the risk of sharing the key that is used
to encrypt the documents with all search users and solve
the trapdoor unlinkability problem in Naveed’s scheme [13],
we modify the construction of blind storage and leverage
ciphertext policy attribute-based encryption (CP-ABE)
technique in the EMRS. However, specific construction of
CP-ABE is out of scope of this paper and we only give
a simple indication here.
SYSTEM SETUP
The data owner takes a security parameter λ, and outputs
two invertible matrixes M1, M2 ∈ R
(d+2)∗(d+2) as well as
a (d+2)-dimension binary vector S as the secret key, where
d represents the size of the keyword dictionary. Then, the data
owner generates a set of attribute keys sk for each search user
according to her role in the system. The data owner chooses
a key KT for a symmetric cryptography Enc(), e.g., AES.
Finally, the data owner sends (M1, M2, S,sk, Enc(),KT ) to
the search user through a secure channel.
B. CONSTRUCTION OF BLIND STORAGE
The data owner chooses a full-domain collusion resistant
hash function H, a full-domain pseudorandom function 9,
a pseudorandom generator 0 and a hash function 8: {0, 1}
∗→
{0, 1}
192
. 9 and 0 are based on the AES block-cipher [13].
Then, the data owner chooses a number α > 1 that defines
the expansion parameter and a number κ that denotes the
minimum number of blocks in a communication.
1) B.KEYGEN
The data owner generates a key K9 for the function 9 and
sends it to the search user using a secure channel.
2) B.BUILD
This phase takes into a large collection of documents D. D is
a list of documents (d1, d2, d3 · · · dm) containing m documents.
where each document has a unique id denoted as idi
.
The B.Build outputs an array of blocks B, which consists
of nb blocks of mb bits each. For document di
, it contains
sizei blocks of mb bits each and each header of these blocks
contains the H(idi). In addition, the header of the first block
of the document di
indicates the size of di
. At the beginning,
we initialize all blocks in B with all 0. For each document
di
in D, we construct the blind storage as follows:
Step 1: Compute the seed σi = 9K9
(idi) as the input
of the function 0. Generate a sufficiently long bit-number
through the function 0 using the seed σi and parse it
as a sequence of integers in the range [nb]. Let π[σi, l]