21-05-2014, 04:56 PM
Web Search Engines
Definition:
A web search engine is a software system designed to search for information on the World Wide Web. The search results are generally presented as a list, often referred to as search engine results pages (SERPs). The results may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.
History:
During early development of the web, there was a list of webservers edited by Tim Berners-Lee and hosted on the CERN webserver. One historical snapshot of the list in 1992 remains,[1] but as more and more webservers went online the central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!"[2]
The very first tool used for searching on the Internet was Archie.[3] The name stands for "archive" without the "v". It was created in 1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites since the amount of data was so limited it could be readily searched manually.
The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings.
Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.
Search Engine Bias
Although search engines are programmed to rank websites based on their popularity and relevancy, empirical studies indicate various political, economic, and social biases in the information they provide. [20] [21] These biases can be a direct result of economic and commercial processes (e.g., companies that advertise with a search engine can also become more popular in its organic search results) and of political processes (e.g., the removal of search results to comply with local laws).
Customized and Filtered Results
Many search engines such as Google and Bing provide customized results based on the user's activity history. This leads to an effect that has been called a filter bubble. The term describes a phenomenon in which websites use algorithms to selectively guess what information a user would like to see, based on information about the user (such as location, past click behavior and search history).
As a result, websites tend to show only information that agrees with the user's past viewpoint, effectively isolating the user in a bubble that tends to exclude contrary information. Prime examples are Google's personalized search results and Facebook's personalized news stream. According to Eli Pariser, who coined the term, users get less exposure to conflicting viewpoints and are isolated intellectually in their own informational bubble.
Since this problem was identified, competing search engines have emerged that seek to avoid it by not tracking or "bubbling" users.
Search algorithm
In computer science, a search algorithm is an algorithm for finding an item with specified properties among a collection of items. The items may be stored individually as records in a database; or may be elements of a search space defined by a mathematical formula or procedure, such as the roots of an equation with integer variables; or a combination of the two, such as the Hamiltonian circuits of a graph.
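To make the two kinds of collection concrete, here is a minimal Python sketch of a linear search used both ways: once over stored records and once over a search space defined by a formula. The helper name find_first, the sample records, and the example equation are illustrative assumptions, not part of the original text.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def find_first(items: Iterable[T], has_property: Callable[[T], bool]) -> Optional[T]:
    """Return the first item satisfying has_property, or None if there is none."""
    for item in items:
        if has_property(item):
            return item
    return None


# Case 1: items stored individually as records (hypothetical sample data).
records = [
    {"id": 1, "title": "Archie"},
    {"id": 2, "title": "Veronica"},
    {"id": 3, "title": "Jughead"},
]
match = find_first(records, lambda r: r["title"] == "Veronica")

# Case 2: items defined by a formula rather than stored: the integer
# roots of x^2 - 5x + 6 = 0, searched over the candidates -10..10.
root = find_first(range(-10, 11), lambda x: x * x - 5 * x + 6 == 0)
```

The same function serves both cases because it only needs a way to enumerate candidates and a test for the specified property. Practical search algorithms differ mainly in how they avoid enumerating every candidate.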
For virtual search spaces
Algorithms for searching virtual spaces are used in constraint satisfaction problems, where the goal is to find a set of value assignments to certain variables that will satisfy specific mathematical equations and inequations. They are also used when the goal is to find a variable assignment that will maximize or minimize a certain function of those variables. Algorithms for these problems include the basic brute-force search (also called "naïve" or "uninformed" search), and a variety of heuristics that try to exploit partial knowledge about the structure of the space, such as linear relaxation, constraint generation, and constraint propagation.
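As a rough illustration of the brute-force approach only (not of the heuristics), the Python sketch below enumerates every assignment of a small constraint satisfaction problem and keeps those that satisfy all constraints. It then reuses the same enumeration to maximize a function. The variable names, domains, constraints, and the function brute_force_solutions are assumptions chosen for the example.

```python
from itertools import product


def brute_force_solutions(domains, constraints):
    """Yield every assignment (a dict) that satisfies all constraints.

    domains maps each variable name to its candidate values;
    constraints is a list of predicates over a full assignment.
    """
    names = list(domains)
    # Enumerate the full Cartesian product of the domains: no pruning at all.
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            yield assignment


# Hypothetical problem: x and y in 0..5 with x + y = 7 and x < y.
domains = {"x": range(0, 6), "y": range(0, 6)}
constraints = [
    lambda a: a["x"] + a["y"] == 7,  # an equation
    lambda a: a["x"] < a["y"],       # an inequation
]
solutions = list(brute_force_solutions(domains, constraints))

# Optimization variant: among assignments with x + y <= 6, maximize x * y.
feasible = brute_force_solutions(domains, [lambda a: a["x"] + a["y"] <= 6])
best = max(feasible, key=lambda a: a["x"] * a["y"])
```

This examines all 36 assignments each time, which is why it is called uninformed. Heuristics such as constraint propagation aim to shrink the domains before or during enumeration so that far fewer assignments need to be checked.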