23-06-2012, 06:41 PM
IEEE 2012 JAVA & .NET PROJECTS
For These Projects Contact:
www.projects9.com
Phone no:09618855666, 08008855666
Email:projects[at]projects9.com
You can download the project topics from here:
[attachment=25353][attachment=25354]
1.A Network Coding Equivalent Content Distribution Scheme for Efficient Peer-to-Peer Interactive VoD Streaming
Abstract:
Although random access operations are desirable for on-demand video streaming in peer-to-peer systems, they are difficult to efficiently achieve due to the asynchronous interactive behaviors of users and the dynamic nature of peers. In this paper, we propose a network coding equivalent content distribution (NCECD) scheme to efficiently handle interactive video-on-demand (VoD) operations in peer-to-peer systems. In NCECD, videos are divided into segments that are then further divided into blocks. These blocks are encoded into independent blocks that are distributed to different peers for local storage. With NCECD, a new client only needs to connect to a sufficient number of parent peers to be able to view the whole video and rarely needs to find new parents when performing random access operations. In most existing methods, a new client must search for parent peers containing specific segments; however, NCECD uses the properties of network coding to cache quivalent content in peers, so that one can pick any parent without additional searches. Experimental results show that the proposed scheme achieves low startup and jump searching delays and requires fewer server resources. In addition, we present the analysis of system parameters to achieve reasonable block loss rates for the proposed scheme
2.Catching Packet Droppers and Modifiers in Wireless Sensor Networks
Abstract
Packet dropping and modification are common attacks that can be launched by an adversary to disrupt communication in wireless multihop sensor networks. Many schemes have been proposed to mitigate or tolerate such attacks, but very few can effectively and efficiently identify the intruders. To address this problem, we propose a simple yet effective scheme, which can identify misbehaving forwarders that drop or modify packets. Extensive analysis and simulations have been conducted to verify the effectiveness and efficiency of the scheme.
3.Cooperative Provable Data Possession for Integrity Verification in Multi-Cloud Storage
Abstract
Provable data possession (PDP) is a technique for ensuring the integrity of data in storage outsourcing. In this paper, we address the construction of an efficient PDP scheme for distributed cloud storage to support the scalability of service and data migration, in which we consider the existence of multiple cloud service providers to cooperatively store and maintain the clients’ data. We present a cooperative PDP (CPDP) scheme based on homomorphic verifiable response and hash index hierarchy. We prove the security of our scheme based on multi-prover zero-knowledge proof system, which can satisfy completeness, knowledge soundness, and zero-knowledge properties. In addition, we articulate performance optimization mechanisms for our
scheme, and in particular present an efficient method for selecting optimal parameter values to minimize the computation costs of clients and storage service providers. Our experiments show that our solution introduces lower computation and communication overheads in comparison with non-cooperative approaches.
4.Distributed Packet Buffers for High-Bandwidth Switches and Routers
Abstract
High-speed routers rely on well-designed packet buffers that support multiple queues, provide large capacity and short response times. Some researchers suggested combined SRAM/DRAM hierarchical buffer architectures to meet these challenges. However, these architectures suffer from either large SRAM requirement or high time-complexity in the memory management. In this paper, we present scalable, efficient, and novel distributed packet buffer architecture. Two fundamental issues need to be addressed to make this architecture feasible: 1) how to minimize the overhead of an individual packet buffer; and 2) how to design scalable packet buffers using independent buffer subsystems. We address these issues by first designing an efficient compact buffer that reduces the SRAM size requirement by (k _ 1Þ=k. Then, we introduce a feasible way of coordinating multiple subsystems with a load-balancing algorithm that maximizes the overall system performance. Both theoretical analysis and experimental results demonstrate that our load-balancing algorithm and the distributed packet buffer architecture can easily scale to meet the buffering needs of high bandwidth links and satisfy the requirements of scale and support for multiple queues.
5.Efficient Computation of Range Aggregates against Uncertain Location-Based Queries
Abstract
In many applications, including location-based services, queries may not be precise. In this paper, we study the problem of efficiently computing range aggregates in a multidimensional space when the query location is uncertain. Specifically, for a query point Q whose location is uncertain and a set S of points in a multidimensional space, we want to calculate the aggregate (e.g., count, average and sum) over the subset S0 of S such that for each p 2 S0, Q has at least probability _ within the distance _ to p. We propose novel, efficient techniques to solve the problem following the filtering-and-verification paradigm. In particular, two novel filtering techniques are proposed to effectively and efficiently remove data points from verification. Our comprehensive experiments based on both real and synthetic data demonstrate the efficiency and scalability of our techniques.
6.Efficient Fuzzy Type-Ahead Search in XML Data
Abstract
In a traditional keyword-search system over XML data, a user composes a keyword query, submits it to the system, and retrieves relevant answers. In the case where the user has limited knowledge about the data, often the user feels “left in the dark” when issuing queries, and has to use a try-and-see approach for finding information. In this paper, we study fuzzy type-ahead search in XML data, a new information-access paradigm in which the system searches XML data on the fly as the user types in query keywords. It allows users to explore data as they type, even in the presence of minor errors of their keywords. Our proposed method has the following features: 1) Search as you type: It extends Autocomplete by supporting queries with multiple keywords in XML data. 2) Fuzzy: It can find high-quality answers that have keywords matching query keywords approximately. 3) Efficient: Our effective index structures and searching algorithms can achieve a very high interactive speed. We study research challenges in this new search
framework. We propose effective index structures and top-k algorithms to achieve a high interactive speed. We examine effective ranking functions and early termination techniques to progressively identify the top-k relevant answers. We have implemented our method on real data sets, and the experimental results show that our method achieves high search efficiency and result quality.
7.Footprint: Detecting Sybil Attacks in Urban Vehicular Networks
Abstract
In urban vehicular networks, where privacy, especially the location privacy of anonymous vehicles is highly concerned, anonymous verification of vehicles is indispensable. Consequently, an attacker who succeeds in forging multiple hostile identifies can easily launch a Sybil attack, gaining a disproportionately large influence. In this paper, we propose a novel Sybil attack detection mechanism, Footprint, using the trajectories of vehicles for identification while still preserving their location privacy. More specifically, when a vehicle approaches a road-side unit (RSU), it actively demands an authorized message from the RSU as the proof of the appearance time at this RSU. We design a location-hidden authorized message generation scheme for two objectives: first, RSU signatures on messages are signer ambiguous so that the RSU location information is concealed from the resulted authorized message; second, two authorized messages signed by the same RSU within the same given period of time (temporarily linkable) are recognizable so that they can be used for identification. With the temporal limitation on the linkability of two authorized messages,
authorized messages used for long-term identification are prohibited. With this scheme, vehicles can generate a location-hidden trajectory for location-privacy-preserved identification by collecting a consecutive series of authorized messages. Utilizing social relationship among trajectories according to the similarity definition of two trajectories, Footprint can recognize and therefore dismiss “communities” of Sybil trajectories. Rigorous security analysis and extensive trace-driven simulations demonstrate the efficacy of Footprint.
8.Handwritten Chinese Text Recognition by Integrating Multiple Contexts
Abstract
This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine multiple contexts (character recognition scores, geometric and linguistic contexts) from the Bayesian decision view, and convert the classifier outputs to posterior probabilities via confidence transformation. In path search, we use a refined beam search algorithm to improve the search efficiency and, meanwhile, use a candidate character augmentation strategy to improve the recognition accuracy. The combining weights of the path evaluation function are optimized by supervised learning using a Maximum Character Accuracy criterion. We evaluated the recognition performance on a Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples o7,356 classes and 5,091 pages of unconstrained handwritten texts. The experimental results show that confidence transformation and combining multiple contexts improve the text line recognition performance significantly. On a test set of 1,015 handwritten pages, the proposed approach achieved character-level accurate rate of 90.75 percent and correct rate of 91.39 percent, which are superior by far to the best results reported in the literature.
9.Mining Web Graphs for Recommendations
Abstract
As the exponential explosion of various contents generated on the Web, Recommendation techniques have become increasingly indispensable. Innumerable different kinds of recommendations are made on the Web every day, including movies, music, images, books recommendations, query suggestions, tags recommendations, etc. No matter what types of data sources are used for the recommendations, essentially these data sources can be modeled in the form of various types of graphs. In this paper, aiming at providing a general framework on mining Web graphs for recommendations, 1) we first propose a novel diffusion method which propagates similarities between different nodes and generates recommendations; 2) then we illustrate how to generalize different recommendation problems into our graph diffusion framework. The proposed framework can be utilized in many recommendation tasks on the World Wide Web, including query suggestions, tag recommendations, expert finding, image recommendations, image annotations, etc. The experimental analysis on large data sets shows the promising future of our work.
10.Multiple Exposure Fusion for High Dynamic Range Image Acquisition
Abstract:
A multiple exposure fusion to enhance the dynamic range of an image is proposed. The construction of high dynamic range images (HDRI) is performed by combining multiple images taken with different exposures and estimating the irradiance value for each pixel. This is a common process for HDRI acquisition. During this process, displacements of
the images caused by object movements often yield motion blur and ghosting artifacts. To address the problem, this paper presents an efficient and accurate multiple exposure fusion technique for the HDRI acquisition. Our method estimates displacements, occlusion and saturated regions simultaneously by using MAP(Maximum a Posteriori)
estimation, and constructs motion blur free HDRIs. We also propose a new weighting scheme for the multiple image fusion. We demonstrate that our HDRI acquisition algorithm is accurate even for images with large motion.
11.On Optimizing Overlay Topologies for Search in Unstructured Peer-to-Peer Networks
Abstract
Unstructured peer-to-peer (P2P) file-sharing networks are popular in the mass market. As the peers participating in unstructured networks interconnect randomly, they rely on flooding query messages to discover objects of interest and thus introduce remarkable network traffic. Empirical measurement studies indicate that the peers in P2P networks have similar preferences, and have recently proposed unstructured P2P networks that organize participating peers by exploiting their similarity. The resultant network may not perform searches efficiently and effectively because existing overlay topology construction algorithms often create unstructured P2P networks without performance guarantees. Thus, we propose a novel overlay formation algorithm for unstructured P2P networks. Based on the file sharing pattern exhibiting the power-law property, our proposal is unique in that it poses rigorous performance guarantees. Theoretical performance results conclude that in a constant probability, 1) searching an object in our proposed network efficiently takes Oðlnc NÞ hops (where c is a small constant), and 2) the search progressively and effectively exploits
the similarity of peers. In addition, the success ratio of discovering an object approximates 100 percent. We validate our theoretical analysis and compare our proposal to competing algorithms in simulations. Based on the simulation results, our proposal clearly outperforms the competing algorithms in terms of 1) the hop count of routing a query message, 2) the successful ratio of resolving a query, 3) the number of messages required for resolving a query, and 4) the message overhead for maintaining and formatting the overlay.
12.Publishing Search Logs—A Comparative Study of Privacy Guarantees
Abstract
Search engine companies collect the “database of intentions,” the histories of their users’ search queries. These search logs are a gold mine for researchers. Search engine companies, however, are wary of publishing search logs in order not to disclose sensitive information. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of a search log. We first show how methods that achieve variants of k-anonymity are vulnerable to active attacks. We then demonstrate that the stronger guarantee ensured by _-differential privacy unfortunately does not provide any utility for this problem. We then propose an algorithm ZEALOUS and show how to set its parameters to achieve ð_; _Þ-probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis by Korolova et al. [17] that achieves ð_0; _0Þ-indistinguishability. Our paper concludes with a large experimental study usingreal applications where we compare ZEALOUS and previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields comparable utility to k-anonymity while at the same time achieving much stronger privacy guarantees.
13.Query Planning for Continuous Aggregation Queries over a Network of Data Aggregator
Abstract:
Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some aggregation function over distributed data items, for example, to know value of portfolio for a client; or the AVG of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic webpage are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into subqueries and executing subqueries on judiciously chosen data aggregators with their individual subquery incoherency bounds. We provide a technique for getting the optimal set of subqueries with their incoherency bounds which satisfies client query’s coherency requirement with least number of refresh messages sent from aggregators to the client. For estimating the number of refresh messages, we build a query cost model which can be used to estimate the number of messages required to satisfy the client specified incoherency bound. Performance results using real-world traces show that our cost-based query planning leads to queries being executed using less than one third the number of messages required by existing schemes.
14 Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption
Abstract:
Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns as personal health information could be exposed to those third party servers and to unauthorized parties. To assure the patients’ control over acce to their own PHRs, it is a promising method to encrypt the PHRs before outsourcing. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access and efficient user revocation, have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semi-trusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute based encryption (ABE) techniques to encrypt each patient’s PHR file. Different from previous works in secure data outsourcing, we focus on the multiple data owner scenario, and divide the users in the PHR system into multiple security domains that greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multi-authority ABE. Our scheme also enables dynamic modification of access policies or file attributes, supports efficient on-demand user/attribute revocation and break-glass access under emergency scenarios. Extensive analytical and experimental results are presented which show the security, scalability and efficiency of our proposed scheme.
15.Clustering with Multiviewpoint-Based Similarity Measure
Abstract
All clustering methods have to assume some cluster relationship among the data objects that they are applied on. Similarity
between a pair of objects can be defined either explicitly or implicitly. In this paper, we introduce a novel multiviewpoint-based similarity measure and two related clustering methods. The major difference between a traditional dissimilarity/similarity measure and ours is that the former uses only a single viewpoint, which is the origin, while the latter utilizes many different viewpoints, which are objects assumed to not be in the same cluster with the two objects being measured. Using multiple viewpoints, more informative assessment of similarity could be achieved. Theoretical analysis and empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with several well-known clustering algorithms that use other popular similarity measures on various document collections to verify the advantages of our proposal. P&DS
Abstract:
Although random access operations are desirable for on-demand video streaming in peer-to-peer systems, they are difficult to efficiently achieve due to the asynchronous interactive behaviors of users and the dynamic nature of peers. In this paper, we propose a network coding equivalent content distribution (NCECD) scheme to efficiently handle interactive video-on-demand (VoD) operations in peer-to-peer systems. In NCECD, videos are divided into segments that are then further divided into blocks. These blocks are encoded into independent blocks that are distributed to different peers for local storage. With NCECD, a new client only needs to connect to a sufficient number of parent peers to be able to view the whole video and rarely needs to find new parents when performing random access operations. In most existing methods, a new client must search for parent peers containing specific segments; however, NCECD uses the properties of network coding to cache quivalent content in peers, so that one can pick any parent without additional searches. Experimental results show that the proposed scheme achieves low startup and jump searching delays and requires fewer server resources. In addition, we present the analysis of system parameters to achieve reasonable block loss rates for the proposed scheme
2.Catching Packet Droppers and Modifiers in Wireless Sensor Networks
Abstract
Packet dropping and modification are common attacks that can be launched by an adversary to disrupt communication in wireless multihop sensor networks. Many schemes have been proposed to mitigate or tolerate such attacks, but very few can effectively and efficiently identify the intruders. To address this problem, we propose a simple yet effective scheme, which can identify misbehaving forwarders that drop or modify packets. Extensive analysis and simulations have been conducted to verify the effectiveness and efficiency of the scheme.
3.Cooperative Provable Data Possession for Integrity Verification in Multi-Cloud Storage
Abstract
Provable data possession (PDP) is a technique for ensuring the integrity of data in storage outsourcing. In this paper, we address the construction of an efficient PDP scheme for distributed cloud storage to support the scalability of service and data migration, in which we consider the existence of multiple cloud service providers to cooperatively store and maintain the clients’ data. We present a cooperative PDP (CPDP) scheme based on homomorphic verifiable response and hash index hierarchy. We prove the security of our scheme based on multi-prover zero-knowledge proof system, which can satisfy completeness, knowledge soundness, and zero-knowledge properties. In addition, we articulate performance optimization mechanisms for our
scheme, and in particular present an efficient method for selecting optimal parameter values to minimize the computation costs of clients and storage service providers. Our experiments show that our solution introduces lower computation and communication overheads in comparison with non-cooperative approaches.
4.Distributed Packet Buffers for High-Bandwidth Switches and Routers
Abstract
High-speed routers rely on well-designed packet buffers that support multiple queues, provide large capacity and short response times. Some researchers suggested combined SRAM/DRAM hierarchical buffer architectures to meet these challenges. However, these architectures suffer from either large SRAM requirement or high time-complexity in the memory management. In this paper, we present scalable, efficient, and novel distributed packet buffer architecture. Two fundamental issues need to be addressed to make this architecture feasible: 1) how to minimize the overhead of an individual packet buffer; and 2) how to design scalable packet buffers using independent buffer subsystems. We address these issues by first designing an efficient compact buffer that reduces the SRAM size requirement by (k _ 1Þ=k. Then, we introduce a feasible way of coordinating multiple subsystems with a load-balancing algorithm that maximizes the overall system performance. Both theoretical analysis and experimental results demonstrate that our load-balancing algorithm and the distributed packet buffer architecture can easily scale to meet the buffering needs of high bandwidth links and satisfy the requirements of scale and support for multiple queues.
5.Efficient Computation of Range Aggregates against Uncertain Location-Based Queries
Abstract
In many applications, including location-based services, queries may not be precise. In this paper, we study the problem of efficiently computing range aggregates in a multidimensional space when the query location is uncertain. Specifically, for a query point Q whose location is uncertain and a set S of points in a multidimensional space, we want to calculate the aggregate (e.g., count, average and sum) over the subset S0 of S such that for each p 2 S0, Q has at least probability _ within the distance _ to p. We propose novel, efficient techniques to solve the problem following the filtering-and-verification paradigm. In particular, two novel filtering techniques are proposed to effectively and efficiently remove data points from verification. Our comprehensive experiments based on both real and synthetic data demonstrate the efficiency and scalability of our techniques.
6.Efficient Fuzzy Type-Ahead Search in XML Data
Abstract
In a traditional keyword-search system over XML data, a user composes a keyword query, submits it to the system, and retrieves relevant answers. In the case where the user has limited knowledge about the data, often the user feels “left in the dark” when issuing queries, and has to use a try-and-see approach for finding information. In this paper, we study fuzzy type-ahead search in XML data, a new information-access paradigm in which the system searches XML data on the fly as the user types in query keywords. It allows users to explore data as they type, even in the presence of minor errors of their keywords. Our proposed method has the following features: 1) Search as you type: It extends Autocomplete by supporting queries with multiple keywords in XML data. 2) Fuzzy: It can find high-quality answers that have keywords matching query keywords approximately. 3) Efficient: Our effective index structures and searching algorithms can achieve a very high interactive speed. We study research challenges in this new search
framework. We propose effective index structures and top-k algorithms to achieve a high interactive speed. We examine effective ranking functions and early termination techniques to progressively identify the top-k relevant answers. We have implemented our method on real data sets, and the experimental results show that our method achieves high search efficiency and result quality.
7.Footprint: Detecting Sybil Attacks in Urban Vehicular Networks
Abstract
In urban vehicular networks, where privacy, especially the location privacy of anonymous vehicles is highly concerned, anonymous verification of vehicles is indispensable. Consequently, an attacker who succeeds in forging multiple hostile identifies can easily launch a Sybil attack, gaining a disproportionately large influence. In this paper, we propose a novel Sybil attack detection mechanism, Footprint, using the trajectories of vehicles for identification while still preserving their location privacy. More specifically, when a vehicle approaches a road-side unit (RSU), it actively demands an authorized message from the RSU as the proof of the appearance time at this RSU. We design a location-hidden authorized message generation scheme for two objectives: first, RSU signatures on messages are signer ambiguous so that the RSU location information is concealed from the resulted authorized message; second, two authorized messages signed by the same RSU within the same given period of time (temporarily linkable) are recognizable so that they can be used for identification. With the temporal limitation on the linkability of two authorized messages,
authorized messages used for long-term identification are prohibited. With this scheme, vehicles can generate a location-hidden trajectory for location-privacy-preserved identification by collecting a consecutive series of authorized messages. Utilizing social relationship among trajectories according to the similarity definition of two trajectories, Footprint can recognize and therefore dismiss “communities” of Sybil trajectories. Rigorous security analysis and extensive trace-driven simulations demonstrate the efficacy of Footprint.
8.Handwritten Chinese Text Recognition by Integrating Multiple Contexts
Abstract
This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine multiple contexts (character recognition scores, geometric and linguistic contexts) from the Bayesian decision view, and convert the classifier outputs to posterior probabilities via confidence transformation. In path search, we use a refined beam search algorithm to improve the search efficiency and, meanwhile, use a candidate character augmentation strategy to improve the recognition accuracy. The combining weights of the path evaluation function are optimized by supervised learning using a Maximum Character Accuracy criterion. We evaluated the recognition performance on a Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples o7,356 classes and 5,091 pages of unconstrained handwritten texts. The experimental results show that confidence transformation and combining multiple contexts improve the text line recognition performance significantly. On a test set of 1,015 handwritten pages, the proposed approach achieved character-level accurate rate of 90.75 percent and correct rate of 91.39 percent, which are superior by far to the best results reported in the literature.
9.Mining Web Graphs for Recommendations
Abstract
As the exponential explosion of various contents generated on the Web, Recommendation techniques have become increasingly indispensable. Innumerable different kinds of recommendations are made on the Web every day, including movies, music, images, books recommendations, query suggestions, tags recommendations, etc. No matter what types of data sources are used for the recommendations, essentially these data sources can be modeled in the form of various types of graphs. In this paper, aiming at providing a general framework on mining Web graphs for recommendations, 1) we first propose a novel diffusion method which propagates similarities between different nodes and generates recommendations; 2) then we illustrate how to generalize different recommendation problems into our graph diffusion framework. The proposed framework can be utilized in many recommendation tasks on the World Wide Web, including query suggestions, tag recommendations, expert finding, image recommendations, image annotations, etc. The experimental analysis on large data sets shows the promising future of our work.
10.Multiple Exposure Fusion for High Dynamic Range Image Acquisition
Abstract:
A multiple exposure fusion to enhance the dynamic range of an image is proposed. The construction of high dynamic range images (HDRI) is performed by combining multiple images taken with different exposures and estimating the irradiance value for each pixel. This is a common process for HDRI acquisition. During this process, displacements of
the images caused by object movements often yield motion blur and ghosting artifacts. To address the problem, this paper presents an efficient and accurate multiple exposure fusion technique for the HDRI acquisition. Our method estimates displacements, occlusion and saturated regions simultaneously by using MAP(Maximum a Posteriori)
estimation, and constructs motion blur free HDRIs. We also propose a new weighting scheme for the multiple image fusion. We demonstrate that our HDRI acquisition algorithm is accurate even for images with large motion.
11.On Optimizing Overlay Topologies for Search in Unstructured Peer-to-Peer Networks
Abstract
Unstructured peer-to-peer (P2P) file-sharing networks are popular in the mass market. As the peers participating in unstructured networks interconnect randomly, they rely on flooding query messages to discover objects of interest and thus introduce remarkable network traffic. Empirical measurement studies indicate that the peers in P2P networks have similar preferences, and have recently proposed unstructured P2P networks that organize participating peers by exploiting their similarity. The resultant network may not perform searches efficiently and effectively because existing overlay topology construction algorithms often create unstructured P2P networks without performance guarantees. Thus, we propose a novel overlay formation algorithm for unstructured P2P networks. Based on the file sharing pattern exhibiting the power-law property, our proposal is unique in that it poses rigorous performance guarantees. Theoretical performance results conclude that in a constant probability, 1) searching an object in our proposed network efficiently takes Oðlnc NÞ hops (where c is a small constant), and 2) the search progressively and effectively exploits
the similarity of peers. In addition, the success ratio of discovering an object approximates 100 percent. We validate our theoretical analysis and compare our proposal to competing algorithms in simulations. Based on the simulation results, our proposal clearly outperforms the competing algorithms in terms of 1) the hop count of routing a query message, 2) the successful ratio of resolving a query, 3) the number of messages required for resolving a query, and 4) the message overhead for maintaining and formatting the overlay.
12.Publishing Search Logs—A Comparative Study of Privacy Guarantees
Abstract
Search engine companies collect the “database of intentions,” the histories of their users’ search queries. These search logs are a gold mine for researchers. Search engine companies, however, are wary of publishing search logs in order not to disclose sensitive information. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of a search log. We first show how methods that achieve variants of k-anonymity are vulnerable to active attacks. We then demonstrate that the stronger guarantee ensured by _-differential privacy unfortunately does not provide any utility for this problem. We then propose an algorithm ZEALOUS and show how to set its parameters to achieve ð_; _Þ-probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis by Korolova et al. [17] that achieves ð_0; _0Þ-indistinguishability. Our paper concludes with a large experimental study usingreal applications where we compare ZEALOUS and previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields comparable utility to k-anonymity while at the same time achieving much stronger privacy guarantees.
13.Query Planning for Continuous Aggregation Queries over a Network of Data Aggregator
Abstract:
Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some aggregation function over distributed data items, for example, to know value of portfolio for a client; or the AVG of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic webpage are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into subqueries and executing subqueries on judiciously chosen data aggregators with their individual subquery incoherency bounds. We provide a technique for getting the optimal set of subqueries with their incoherency bounds which satisfies client query’s coherency requirement with least number of refresh messages sent from aggregators to the client. For estimating the number of refresh messages, we build a query cost model which can be used to estimate the number of messages required to satisfy the client specified incoherency bound. Performance results using real-world traces show that our cost-based query planning leads to queries being executed using less than one third the number of messages required by existing schemes.
14 Scalable and Secure Sharing of Personal Health Records in Cloud Computing using Attribute-based Encryption
Abstract:
Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns as personal health information could be exposed to those third party servers and to unauthorized parties. To assure the patients’ control over acce to their own PHRs, it is a promising method to encrypt the PHRs before outsourcing. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access and efficient user revocation, have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semi-trusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute based encryption (ABE) techniques to encrypt each patient’s PHR file. Different from previous works in secure data outsourcing, we focus on the multiple data owner scenario, and divide the users in the PHR system into multiple security domains that greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multi-authority ABE. Our scheme also enables dynamic modification of access policies or file attributes, supports efficient on-demand user/attribute revocation and break-glass access under emergency scenarios. Extensive analytical and experimental results are presented which show the security, scalability and efficiency of our proposed scheme.
15.Clustering with Multiviewpoint-Based Similarity Measure
Abstract
All clustering methods have to assume some cluster relationship among the data objects that they are applied on. Similarity
between a pair of objects can be defined either explicitly or implicitly. In this paper, we introduce a novel multiviewpoint-based similarity measure and two related clustering methods. The major difference between a traditional dissimilarity/similarity measure and ours is that the former uses only a single viewpoint, which is the origin, while the latter utilizes many different viewpoints, which are objects assumed to not be in the same cluster with the two objects being measured. Using multiple viewpoints, more informative assessment of similarity could be achieved. Theoretical analysis and empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with several well-known clustering algorithms that use other popular similarity measures on various document collections to verify the advantages of our proposal. P&DS
You can download the project topics from here:
[attachment=25353][attachment=25354]