02-01-2013, 01:58 PM
Boolean Retrieval
Definition
The Boolean retrieval model is a model for information retrieval in which any query can be posed as a Boolean expression of terms, that is, terms combined with the operators AND, OR, and NOT.
Can’t build the matrix
A 500K x 1M term-document matrix has half a trillion 0’s and 1’s.
But it has no more than one billion 1’s.
So the matrix is extremely sparse.
What’s a better representation?
We only record the 1 positions.
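Recording only the 1 positions amounts to storing, for each term, the list of documents that contain it. The following is a minimal sketch of that idea, not a definitive implementation; the function name build_index, the whitespace tokenization, and the three sample documents are assumptions made for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the sorted list of doc IDs that contain it.

    Only the (term, doc) pairs that would be 1 in the matrix are stored.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical toy collection.
docs = {
    1: "brutus killed caesar",
    2: "caesar praised brutus",
    3: "calpurnia warned caesar",
}
index = build_index(docs)

# Number of stored entries versus number of cells in the full matrix.
stored = sum(len(ids) for ids in index.values())
cells = len(index) * len(docs)
print(index["caesar"], stored, cells)
```

Even in this toy collection, fewer entries are stored than the full matrix has cells; the gap grows as the collection gets larger and sparser.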
Boolean queries: Exact match
The Boolean retrieval model lets us ask any query that is a Boolean expression:
Boolean queries use AND, OR and NOT to join query terms
Views each document as a set of words
Is precise: a document either matches the condition or it does not.
Perhaps the simplest model to build an IR system on
Primary commercial retrieval tool for 3 decades.
Many search systems you still use are Boolean:
Email, library catalog, Mac OS X Spotlight
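Because each document is viewed as a set of words, the three operators map directly onto set operations over the documents that contain each term. The sketch below assumes a tiny hand-written index; the names index, all_docs, postings, and_, or_ and not_ are hypothetical and chosen only for this example.

```python
# Hypothetical index: term -> set of doc IDs containing the term.
index = {
    "brutus": {1, 2, 4},
    "caesar": {1, 2, 3, 5},
    "calpurnia": {3},
}
all_docs = {1, 2, 3, 4, 5}

def postings(term):
    """Docs containing the term; an unknown term matches nothing."""
    return index.get(term, set())

def and_(a, b):
    return a & b          # docs matching both operands

def or_(a, b):
    return a | b          # docs matching either operand

def not_(a):
    return all_docs - a   # docs that do not match the operand

# Query: brutus AND caesar AND NOT calpurnia
result = and_(and_(postings("brutus"), postings("caesar")),
              not_(postings("calpurnia")))
print(sorted(result))
```

Every document is either in the result set or not; there is no notion of a partial match or of one match being better than another.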
Ranking search results
Boolean queries give inclusion or exclusion of docs.
Often we want to rank/group results
Need to measure proximity from query to each doc.
Need to decide whether docs presented to user are singletons, or a group of docs covering various aspects of the query.
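One simple way to measure proximity from a query to each doc, while still treating documents as sets of words, is the Jaccard coefficient (size of the intersection divided by size of the union). The slides do not name a specific measure, so this is only an illustrative sketch; the functions jaccard and rank and the sample documents are assumptions.

```python
def jaccard(a, b):
    """Overlap between two word collections, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank(query, docs):
    """Return (score, doc_id) pairs, best score first, ties broken by doc ID."""
    q = query.lower().split()
    scored = [(jaccard(q, text.lower().split()), doc_id)
              for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

# Hypothetical toy collection.
docs = {
    1: "brutus killed caesar",
    2: "caesar praised brutus and antony",
    3: "calpurnia warned caesar",
}
ranked = rank("brutus caesar", docs)
for score, doc_id in ranked:
    print(doc_id, round(score, 3))
```

Unlike a Boolean query, this gives every document a score, so results can be ordered rather than just included or excluded.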