01-06-2012, 03:49 PM
Effective Navigation on Query Results of Biomedical Databases
MOTIVATION
Exploratory queries are increasingly common in the
life sciences
e.g., search for citations on a given keyword on PubMed
These queries return too many results, but only a small
fraction is relevant
The user ends up examining all or most of the result tuples
to find the interesting ones
Can happen when the user is unsure about what is
relevant
e.g., the user is looking for articles on a broad topic such as ’cancer’;
the query returns over 2 million citations on PubMed
This phenomenon is commonly referred to as
’information overload’
CATEGORIZATION IN INFORMATION SYSTEMS
Assumptions:
Tuples in the database are annotated with one or more
categories or concepts
The set of concepts is arranged in a concept hierarchy
Example: Each citation in PubMed is associated with
several concepts from the MeSH (Medical Subject
Headings) hierarchy, typically 12 to 20
QUERY RESULT NAVIGATION: NAIVE APPROACH
Create the Navigation Tree as follows:
Extract the set S of concepts annotating tuples in the
query result set Q
Construct the minimal sub-tree T of the concept hierarchy that
covers all concepts in S
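The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the hierarchy is assumed to be a tree stored as a child-to-parent map (the real MeSH hierarchy lets a concept appear in several places), T is assumed to stay rooted at the hierarchy root, and the names `build_navigation_tree`, `toy_parent`, and `toy_annotations` as well as the toy data are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def build_navigation_tree(
    parent: Dict[str, Optional[str]],
    annotations: Dict[str, Set[str]],
    result_ids: Iterable[str],
) -> Dict[str, Optional[str]]:
    """Return the child-to-parent map of the sub-tree T covering the result concepts."""
    # Step 1: S = all concepts annotating tuples in the query result Q.
    s: Set[str] = set()
    for rid in result_ids:
        s |= annotations[rid]

    # Step 2: T = every concept in S plus its ancestors up to the root.
    # Stopping at an already-visited node avoids re-walking shared paths.
    tree: Dict[str, Optional[str]] = {}
    for concept in s:
        node: Optional[str] = concept
        while node is not None and node not in tree:
            tree[node] = parent[node]
            node = parent[node]
    return tree


# Hypothetical toy hierarchy (child -> parent); "MeSH" stands in for the root.
toy_parent = {
    "MeSH": None,
    "Diseases": "MeSH",
    "Neoplasms": "Diseases",
    "Chemicals": "MeSH",
    "Proteins": "Chemicals",
    "Anatomy": "MeSH",
}

# Hypothetical citations and the concepts annotating them.
toy_annotations = {
    "c1": {"Neoplasms", "Proteins"},
    "c2": {"Neoplasms"},
    "c3": {"Anatomy"},
}

# Query result Q contains only c1 and c2, so "Anatomy" is not covered.
toy_tree = build_navigation_tree(toy_parent, toy_annotations, ["c1", "c2"])
```

In this toy run T keeps only the branches that lead to a concept in S, so the unrelated "Anatomy" branch is pruned while the intermediate nodes "Diseases" and "Chemicals" are retained to keep the tree connected.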
Problems:
Massive size of the Navigation Tree
MeSH has over 48,000 concept nodes
e.g., 313 results span over 3,000 of these concepts
Large number of duplicate tuples
Each tuple is annotated with 12-20 MeSH concepts
Total tuple count across the Navigation Tree is over 5,000
Effort required to navigate the query results increases!
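The duplication problem can be illustrated with a small sketch. It assumes, as a simplification, that a tuple is listed once under every concept that annotates it, so the number of entries the user sees is the sum of per-concept counts rather than the number of distinct tuples. Under that assumption, 313 results with 12 to 20 concepts each would yield between 3,756 and 6,260 entries, which is in line with the figure of over 5,000 quoted above. The function name and toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def count_tuple_occurrences(
    annotations: Dict[str, Set[str]], result_ids: Iterable[str]
) -> Counter:
    """Count how many result tuples are listed under each concept."""
    per_concept: Counter = Counter()
    for rid in result_ids:
        for concept in annotations[rid]:
            per_concept[concept] += 1
    return per_concept


# Hypothetical citations and the concepts annotating them.
dup_annotations = {
    "c1": {"Neoplasms", "Proteins", "Humans"},
    "c2": {"Neoplasms", "Humans"},
    "c3": {"Proteins"},
}
dup_results = ["c1", "c2", "c3"]

per_concept = count_tuple_occurrences(dup_annotations, dup_results)
distinct_tuples = len(set(dup_results))         # tuples the query returned
listed_entries = sum(per_concept.values())      # entries shown across the tree
duplicates = listed_entries - distinct_tuples   # extra copies the user must skip
```

Here three distinct citations turn into six listed entries, so half of what the user scans is repetition; the ratio grows with the number of concepts per tuple.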