07-07-2012, 06:23 PM
I am a 4th-year B.Tech student, and we want to do a mini project on confined web spiders. However, we are unable to understand the major difference between this and a traditional search engine like Google, since both return results for a typed keyword. I am providing the abstract below, so please read it and help me understand the actual purpose of this project.
1.1. OBJECTIVE
The objective of the project is to develop a system that retrieves information and documents efficiently and limits the number of returned documents by performing an intelligent search procedure. The purpose is to design a system that displays only relevant information to the user by suppressing unnecessary and irrelevant information.
1.2. EXISTING SYSTEM
Traditional Confined Web Spiders consult databases of the most frequently used words in documents, such as words drawn from a document's title and first few sentences. Hence they won't retrieve documents in which the keywords being searched for are buried somewhere deeper within the document. They are useful only for searching for specific information on the World Wide Web (WWW). Many page authors submit web pages to a Confined Web Spider that use various tricks to boost their ratings, such as an irrelevant title tag or certain words repeated in the first few levels that have nothing to do with the actual contents of the page. This can lead to a situation in which not even one of the top ten sites listed is on the subject you would expect. Anyone can put up a web page, so a search can return anything from academic results to internet gossip. Because HTML doesn't provide any standard method to identify the contents of a document, it is extremely difficult for a Confined Web Spider to identify the contents of a web page in order to index it. The World Wide Web also seems to be ever expanding, which is an increasing threat to the quality of the information available on it.
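The limitation described above (indexing only the title and the first few sentences) can be illustrated with a small sketch. This is not the project's code; the function names, the sample document and the choice of two lead sentences are all assumptions made for illustration.

```python
import re

def lead_terms(title, body, lead_sentences=2):
    """Collect lowercase words from the title and the first few sentences only."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = re.split(r"(?<=[.!?])\s+", body.strip())
    lead = " ".join(sentences[:lead_sentences])
    return set(re.findall(r"[a-z0-9]+", (title + " " + lead).lower()))

def full_terms(title, body):
    """Collect lowercase words from the whole document."""
    return set(re.findall(r"[a-z0-9]+", (title + " " + body).lower()))

# Hypothetical document: the word "earthworms" only appears near the end.
doc_title = "Gardening tips"
doc_body = ("Spring is a good time to plant. Water the beds every morning. "
            "Compost improves soil structure. Earthworms also help aerate the soil.")

lead_index = lead_terms(doc_title, doc_body)
full_index = full_terms(doc_title, doc_body)

print("earthworms" in lead_index)  # False: the keyword is buried in the body
print("earthworms" in full_index)  # True: a full-text index would find it
```

A search for "earthworms" against the lead-only index misses this document, which is the behaviour the abstract criticizes.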
1.3. PROPOSED SYSTEM
XML (Extensible Markup Language) is a simplified subset of SGML (Standard Generalized Markup Language), the mother of all document-definition languages. XML is not as powerful as SGML, but it is much easier to use. Developing web pages in XML is much like developing them in HTML, but XML gives authors the ability to invent their own tags; the tag names and what they mean are left to the author to define, depending on the subject matter. The most important thing about XML is that it allows more detail to be included in a document, so searching for specific topics should become more accurate and avoid many mismatches.
This application automates the process of sending queries to these websites using advanced technology and presents the search results from all the sites to the user. It is a Confined Web Spider developed for easy search. This Confined Web Spider software is developed using state-of-the-art, highly calibrated technology, and it is fully operational with current technologies and practices. In addition, the user interface provided in this application will make the user or administrator more comfortable, with all the complex tools at his or her disposal. Implementing the Confined Web Spider software tool on any organization's website is very practical, as it doesn't demand any external resources or components.
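The point about author-defined XML tags making search more accurate can be sketched roughly as follows. The abstract does not specify any tag names or search API, so the tags (library, book, title, subject, summary), the sample data and both functions are hypothetical, and Python's standard xml.etree.ElementTree module is used only as a convenient parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML with author-defined tags (not taken from the abstract).
CATALOG_XML = """
<library>
  <book>
    <title>Python Basics</title>
    <subject>programming</subject>
    <summary>An introduction that never mentions snakes.</summary>
  </book>
  <book>
    <title>Reptiles of Asia</title>
    <subject>zoology</subject>
    <summary>Covers the python, the cobra and other snakes.</summary>
  </book>
</library>
"""

def search_anywhere(xml_text, keyword):
    """Plain keyword search: match the keyword anywhere in a record's text."""
    root = ET.fromstring(xml_text.strip())
    keyword = keyword.lower()
    return [record.findtext("title") for record in root
            if keyword in "".join(record.itertext()).lower()]

def search_by_tag(xml_text, tag, keyword):
    """Tag-aware search: match the keyword only inside the given tag."""
    root = ET.fromstring(xml_text.strip())
    keyword = keyword.lower()
    hits = []
    for record in root:
        for field in record.iter(tag):
            if field.text and keyword in field.text.lower():
                hits.append(record.findtext("title"))
                break  # one hit per record is enough
    return hits

print(search_anywhere(CATALOG_XML, "python"))
print(search_by_tag(CATALOG_XML, "subject", "zoology"))
```

The plain keyword search for "python" matches both books, while restricting the search to the subject tag returns only the zoology book. That is the kind of mismatch the custom tags are meant to avoid.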
Thank you...