Seminar topic on How Search Engine Works




Definition

What is Search Engine?
A search engine is a program that searches documents for specified keywords and returns a list of the documents in which those keywords were found.
More broadly, the term covers a general class of programs, such as Google, Bing and Yahoo! Search, that enable users to search for documents on the World Wide Web.


Spider or Crawler

The spider visits a web page, reads it, and then follows links to other pages within the site.
This is what it means when someone refers to a site being "spidered" or "crawled".
This is also known as “harvesting”.
The spider returns to the site on a regular basis, such as every month or two, to look for changes. In short, the spider crawls the Web, following links to find pages.
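
The visit, read and follow-links loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any real spider is built: the PAGES dictionary is a hypothetical in-memory site that stands in for real HTTP requests, and the names LinkExtractor and crawl are invented for this example.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "site": path -> HTML. It stands in for real HTTP fetches.
PAGES = {
    "/index.html": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/news.html": '<a href="/about.html">About</a> <a href="/archive.html">Archive</a>',
    "/archive.html": "No links here.",
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, pages):
    """Breadth-first crawl: visit a page, read it, then follow its links."""
    visited = []          # pages in the order the spider read them
    seen = {start}        # pages already queued, so none is fetched twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue      # broken link: nothing to read
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


print(crawl("/index.html", PAGES))
```

A real spider would also revisit pages on a schedule to pick up changes, which this sketch leaves out.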


The “indexer”

Everything the spider finds goes into the second part of a search engine, the index.
The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds.
If a web page changes, then this book is updated with the new information.
Indexing the pages creates an index from every word to every place in which it occurs.
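
An index "from every word to every place it occurs" is usually called an inverted index. The sketch below is one simple way to build it, assuming that a "place" means a (document, word position) pair and that words are runs of letters and digits. The documents and the name build_index are made up for illustration.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to a list of (doc_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].append((doc_id, position))
    return index


# Hypothetical pages collected by the spider.
DOCS = {
    "page1": "Search engines index the web",
    "page2": "The spider crawls the web",
}

index = build_index(DOCS)
print(index["web"])
print(index["the"])
```

Looking up a keyword is then a single dictionary access instead of a scan through every stored page.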


What is a “Meta Search Engine”?

Meta Search Engines search more than one search engine at the same time. They can search up to 20 search engines all at once.
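
The idea can be sketched as follows: send the same query to several engines at the same time and merge the answers. The three engine functions here are hypothetical stand-ins that return fixed lists, and the round-robin merge is just one simple assumption about how results might be combined.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real engines; each returns a ranked list of URLs.
def engine_a(query):
    return ["a.example/1", "shared.example/x"]


def engine_b(query):
    return ["shared.example/x", "b.example/2"]


def engine_c(query):
    return ["c.example/3"]


def meta_search(query, engines):
    """Query every engine at once, then merge results rank by rank."""
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged = []
    seen = set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            # Take each engine's result at this rank, skipping duplicates.
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged


print(meta_search("search engines", [engine_a, engine_b, engine_c]))
```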


Disadvantages of Search Engines

They do not crawl the web in “real time”.
If a site is not linked from other pages or submitted to the search engine, it may not be found.
Not every page of a site is searchable.
Special tools are needed to search the Invisible/Deep Web.
Few search engines search the full text of Web pages.


How a Search Engine Works

Search engines actually search the web to “index” all of the words on the Internet. This is a huge job! A good search engine will be able to find the exact words you are looking for.
Some search engines search only certain types of sites, like medical sites, animal sites, and so on. The results they give will be more limited, but probably more thorough.


How Google Works

Here we look at how Google creates the index and the database of documents that it accesses when processing a query.

Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing.
Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing.
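
As a rough illustration of splitting one job across many workers, the sketch below counts a keyword in several document "shards" at the same time and adds up the partial counts. The shard data and function names are hypothetical. Python threads are used only to show the structure; a real distributed system would spread the shards over separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def count_keyword(shard, keyword):
    """Count occurrences of keyword in one shard (a list of documents)."""
    return sum(doc.lower().split().count(keyword) for doc in shard)


def parallel_count(shards, keyword):
    """Give each shard to its own worker, then combine the partial counts."""
    if not shards:
        return 0
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_counts = pool.map(count_keyword, shards, [keyword] * len(shards))
        return sum(partial_counts)


# Hypothetical shards, each standing in for the documents held by one machine.
SHARDS = [
    ["the web is large", "web pages link to web pages"],
    ["search the web"],
    ["no match here"],
]

print(parallel_count(SHARDS, "web"))
```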
Google has three distinct parts:

Googlebot, a web crawler that finds and fetches web pages.
The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
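
The way the indexer and the query processor fit together can be sketched as below. This is a toy model under stated assumptions, not Google's actual method: the documents are invented, and relevance is scored simply by how many times the query words appear in each document.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexer: map each word to a count of its occurrences per document."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word][doc_id] += 1
    return index


def search(query, index):
    """Query processor: score documents by matching words, best first."""
    scores = Counter()
    for word in tokenize(query):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical documents fetched by the crawler.
DOCS = {
    "d1": "web crawler fetches web pages",
    "d2": "the indexer sorts words",
    "d3": "crawler and indexer",
}

index = build_index(DOCS)
print(search("web crawler", index))
```

Real engines combine many more signals than word counts when deciding which documents are most relevant.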


Conclusions

Search engine technology has gone through several evolutions and has finally reached the point where Artificial Intelligence can offer tremendous help.
We have outlined a new architecture of a search engine based on robust parsing of input queries and data mining of query log data.
The advantages of this search engine include high precision and efficiency without compromising coverage.
In the future, we plan to carry out empirical studies of this type of search engine with data from realistic internet searches.