05-12-2012, 04:05 PM
Peer-to-peer (P2P) Text Search
Abstract
Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for providing federated search capability to large-scale networks of text digital libraries. However, P2P networks have so far mostly used simple search techniques based on document names or controlled vocabulary terms, and provided very limited support for full-text search of document contents. This proposal provides solutions to full-text federated search with relevance-based document ranking within an integrated framework of P2P network overlay, search, and evolution models. Previous notions of P2P network architectures are extended to define a network overlay model with desired content distribution and navigability.
Introduction:
Peer-to-peer (P2P) networks integrate autonomous computing resources without requiring a central coordinating authority, which makes them a potentially robust and scalable model for federated search over large-scale networks of text digital libraries. So far, however, P2P networks have mostly relied on simple search techniques based on document names or controlled vocabulary terms, with very limited support for full-text search of document contents. This proposal addresses full-text federated search with relevance-based document ranking within an integrated framework of three models: P2P network overlay, search, and evolution. First, we extend previous notions of P2P networks to define a network overlay model with enhanced functionality in network architecture, and with desired content distribution and navigability in network topology. Second, building on the architecture extended to support full-text federated search, we develop a network search model that adapts existing approaches to federated search and introduces new methods for resource representation, resource selection, and result merging, according to the unique characteristics of P2P networks. Third, we propose a network evolution model that describes how a P2P network can dynamically and autonomously evolve into one with the defined network topology to further improve search performance.
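The three federated search steps named above (resource representation, resource selection, result merging) can be illustrated with a small sketch. This is not the proposal's actual method: the class LibraryPeer, the term-share selection score, and the max-score normalisation are all simplifying assumptions chosen only to show how the steps fit together.

```python
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class LibraryPeer:
    """Hypothetical peer holding a small local document collection."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs  # doc_id -> text
        # Resource representation: aggregate term counts for the collection.
        self.description = Counter()
        for text in docs.values():
            self.description.update(tokenize(text))

    def search(self, query):
        """Local search: score = occurrences of query terms in the document."""
        terms = tokenize(query)
        results = []
        for doc_id, text in self.docs.items():
            counts = Counter(tokenize(text))
            score = sum(counts[t] for t in terms)
            if score > 0:
                results.append((doc_id, score))
        return results


def select_peers(peers, query, k):
    """Resource selection: keep the k peers whose descriptions best match."""
    terms = tokenize(query)

    def peer_score(peer):
        total = sum(peer.description.values()) or 1
        return sum(peer.description[t] for t in terms) / total

    ranked = sorted(peers, key=peer_score, reverse=True)
    return [p for p in ranked[:k] if peer_score(p) > 0]


def merge_results(per_peer_results):
    """Result merging: scale each peer's scores by its top score, then sort."""
    merged = []
    for peer_name, results in per_peer_results.items():
        if not results:
            continue
        top = max(score for _, score in results)
        merged.extend((peer_name, doc_id, score / top) for doc_id, score in results)
    return sorted(merged, key=lambda r: r[2], reverse=True)


def federated_search(peers, query, k=2):
    """Select peers, query only those, and merge their ranked lists."""
    selected = select_peers(peers, query, k)
    return merge_results({p.name: p.search(query) for p in selected})
```

Only the aggregate description of each peer is consulted during selection, so a query never has to visit peers whose collections are unlikely to contain relevant documents.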
Problem Statement:
Most search techniques developed for full-text ranked retrieval assume centralized control. Either all the documents are stored in a centralized repository, or information about all the documents is gathered at a centralized directory service. Traditional federated search (“distributed information retrieval”) requires only aggregate directory information about each collection rather than about each individual document. However, it still assumes a centralized directory that stores the directory information of all the collections. A central authority for search purposes may be undesirable in P2P networks because it is liable to become a performance bottleneck or the target of malicious attacks, or because it requires IT infrastructure and resources that are unavailable or impractical in the environment.
Aim of the Project:
The ultimate aim of all search engines is to provide quality search results efficiently and speedily. In this project we implement a Full Text Search mechanism and test it through simulation on a single computer using a large real data collection. We then extend it to a peer-to-peer system.
Objective of Project:
Today, as the volume of available information keeps growing, there is an increasing need for accurate data search over the Internet. So far, Full Text Search is available only locally, which is its weak point. The main purpose of this project is to resolve this problem by extending Full Text Search over a peer-to-peer network. At the beginning of the project we implement a desktop search similar to Google's, and then extend it to a network Full Text Search implemented on a P2P system.
Existing System:
The majority of the previous research on search in P2P networks has focused on P2P networks used for file-sharing or distributed information storage. As a result, the search techniques developed for P2P networks have so far mostly been limited to simple matching over document names, identifiers, or keywords from a small vocabulary for limited-domain contents (Tsoumakos and Roussopoulos 2003b) (Sakaryan et al. 2004) (Li and Wu 2005). In contrast, it has already become common practice for text digital libraries developed during the last decade to perform full-text search, in which the full body of each text document is searched. In addition, term frequency information is often used to rank documents by how well they satisfy each query, and the search result is presented with some form of relevance ranking (“full-text ranked retrieval”). We argue that most of the recent research on P2P networks offers little useful guidance for providing full-text search of current text digital libraries with open-domain contents. Thus we focus on developing solutions to full-text ranked retrieval for federated search of text digital libraries in P2P networks.
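To make the idea of full-text ranked retrieval concrete, here is a minimal sketch that builds an inverted index with term frequencies and ranks documents by a TF-IDF style score. The function names and the particular weighting (term frequency times log(1 + N / document frequency)) are assumptions for illustration only; real systems use more refined weighting and text processing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(index, num_docs, query):
    """Return (doc_id, score) pairs sorted by descending relevance score."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        # Rare terms get a higher weight than common ones.
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = {
    "d1": "p2p search in p2p networks",
    "d2": "digital library search",
    "d3": "file sharing",
}
index = build_index(docs)
ranking = rank(index, len(docs), "p2p search")
```

Here d1 ranks above d2 because it contains both query terms and the rarer term twice, while d3 matches nothing and is left out. Name-based matching, by contrast, would not look inside the document bodies at all.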
Proposed Project Design:
The local search tab contains two sub-tabs, “Indexes” and “Folders”. On the “Indexes” tab you can choose the time interval between two indexing runs. To index immediately, press the “Indexes Now” button. To index every time the application starts, check the “Perform Indexation on Start” box. On the “Folders” tab you can pick a folder by pressing “Browse Folders”. On the network search panel you can select the folders that you want to share with other people. Finally, in the search options you can set a minimum score; results scoring below this value are not returned.
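The options described above could be held in a single settings object. The sketch below is an assumption about how such settings might be modelled; the field names and default values are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSettings:
    """Hypothetical container for the options exposed in the UI."""
    index_interval_minutes: int = 60   # time between two indexing runs (assumed unit)
    index_on_start: bool = False       # the "Perform Indexation on Start" box
    indexed_folders: list = field(default_factory=list)  # chosen via "Browse Folders"
    shared_folders: list = field(default_factory=list)   # folders shared on the network
    min_score: float = 0.0             # results below this score are dropped


def filter_by_score(results, settings):
    """Keep only (document, score) pairs at or above the minimum score."""
    return [(doc, score) for doc, score in results if score >= settings.min_score]


settings = SearchSettings(min_score=0.5, indexed_folders=["docs"])
kept = filter_by_score([("a.txt", 0.9), ("b.txt", 0.2), ("c.txt", 0.5)], settings)
```

With a minimum score of 0.5, the result for b.txt is dropped and the other two are kept in their original order.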
JXTA
Project JXTA is one of the first attempts to formulate core P2P protocols on top of which P2P applications could be built. The JXTA network consists of a series of interconnected nodes, or peers. A peer may be any type of device, from a sensor to a supercomputer, or even a virtual process. Multiple peers may run on a single physical device and, potentially, multiple physical devices could cooperate to act as a single peer. Two types of services are common within JXTA networks: peer services and group services. Each peer group includes, as part of its definition, the set of group services that each peer must run in order to participate in the peer group. Each peer operates independently and asynchronously from all other peers and is uniquely identified by a Peer ID. Peers publish one or more network addresses for use with the JXTA protocols. Each published address is advertised as a peer endpoint, which identifies the network address.
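The relationship between peers, Peer IDs, endpoints, and peer groups can be pictured with a small data model. This sketch does not use the JXTA API (JXTA itself is a Java library); every class, field, and address format below is a hypothetical stand-in that only mirrors the concepts in the paragraph above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer: a unique ID, the services it runs, its endpoints."""
    name: str
    services: set = field(default_factory=set)
    endpoints: list = field(default_factory=list)  # published network addresses
    peer_id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class PeerGroup:
    """Hypothetical peer group defined by the services its members must run."""
    name: str
    required_services: set
    members: dict = field(default_factory=dict)  # peer_id -> Peer

    def join(self, peer):
        """Admit a peer only if it runs every required group service."""
        missing = self.required_services - peer.services
        if missing:
            raise ValueError(f"{peer.name} is missing services: {sorted(missing)}")
        self.members[peer.peer_id] = peer


group = PeerGroup("text-search", {"discovery", "search"})
node = Peer("node-1", {"discovery", "search", "chat"}, ["tcp://10.0.0.1:9701"])
group.join(node)
```

A peer that runs only some of the required services is rejected by join, which reflects the rule that the group's definition fixes the services every member must provide.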
Scope of Project:
Search Information is a corporate retrieval tool that allows fast and convenient search across a corporate network. It searches large data volumes, consolidating information from a multitude of sources (file formats, documents, etc.). It is designed to search at high speed regardless of the number of active workstations and the amount of data to be processed. Its similar-document search, with the feature of uncovering duplicates, improves the quality of search results and significantly reduces the time spent on a search session. Because it is simple to integrate, it does not require any changes to existing business processes.