05-12-2012, 04:50 PM
Introduction to Information Retrieval
The problem of IR
Goal = find documents relevant to an information need from a large document set
IR problem
First applications: in libraries (1950s)
ISBN: 0-201-12227-8
Author: Salton, Gerard
Title: Automatic text processing: the transformation, analysis, and retrieval of information by computer
Publisher: Addison-Wesley
Date: 1989
Content: <Text>
A document has external attributes (the catalogue fields above) and an internal attribute (its content)
Search by external attributes = Search in DB
IR: search by content
Possible approaches
1. String matching (linear search in documents)
- Slow
- Difficult to improve
2. Indexing (*)
- Fast
- Easy to improve further
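The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy collection `docs`, the whitespace tokenization, and the function names `linear_search`, `build_index`, and `index_search` are all made up here and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical toy collection; the texts are invented for illustration.
docs = {
    1: "automatic text processing by computer",
    2: "retrieval of information from text",
    3: "library catalogue search",
}

def linear_search(term, docs):
    """Approach 1: scan every document for the term at query time."""
    return sorted(doc_id for doc_id, text in docs.items()
                  if term in text.split())

def build_index(docs):
    """Approach 2: build a term -> set of document ids mapping once."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():  # assumption: whitespace tokenization
            index[token].add(doc_id)
    return index

def index_search(term, index):
    """Answer a query with one dictionary lookup instead of a full scan."""
    return sorted(index.get(term, set()))

index = build_index(docs)
print(linear_search("text", docs))
print(index_search("text", index))
```

Both calls return the same document ids. The linear scan costs time proportional to the size of the whole collection on every query, while the index pays that cost once when it is built. The index is also where later improvements (stemming, term weighting, and so on) would naturally be added.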
Main problems in IR
1. Document and query indexing
- How to best represent their contents?
2. Query evaluation (or retrieval process)
- To what extent does a document correspond to a query?
3. System evaluation
- How good is a system?
- Are the retrieved documents relevant? (precision)
- Are all the relevant documents retrieved? (recall)
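The two evaluation questions correspond to the standard set-based definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. A minimal sketch follows. The function name `precision_recall`, the example document ids, and the convention of returning 0.0 for an empty set are assumptions made here for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids of the documents the system returned
    relevant:  ids of the documents judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    # Assumption: report 0.0 when a denominator would be zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 documents retrieved, 3 relevant in total, 2 in common.
p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up example, 2 of the 4 retrieved documents are relevant (precision 0.5) and 2 of the 3 relevant documents are retrieved (recall about 0.67). Retrieving more documents tends to raise recall and lower precision, which is why the two are usually reported together.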