Organizing User Search Histories Report

**study tips** · 25-08-2017, 09:32 PM

Organizing User Search Histories

.doc

Organizing User Search.doc (Size: 1.88 MB / Downloads: 25)

Abstract

Users are increasingly pursuing complex task-oriented goals on the Web, such as making travel arrangements, managing finances or planning purchases. To this end, they usually break down the tasks into a few co-dependent steps and issue multiple queries around these steps repeatedly over long periods of time. To better support users in their long-term information quests on the Web, search engines keep track of their queries and clicks while searching online. In this paper, we study the problem of organizing a user’s historical queries into groups in a dynamic and automated fashion. Automatically identifying query groups is helpful for a number of different search engine components and applications, such as query suggestions, result ranking, query al- terations, sessionization, and collaborative search. In our approach, we go beyond approaches that rely on textual similarity or time thresholds, and we propose a more robust approach that leverages search query logs. We experimentally study the performance of different techniques, and showcase their potential, especially when combined together.

INTRODUCTION

S the size and richness of information on the Web grows, so does the variety and the complexity of
tasks that users try to accomplish online. Users are no longer content with issuing simple navigational queries. Various studies on query logs (e.g., Yahoo’s [1] and AltaVista’s [2]) reveal that only about 20% of queries are navigational. The rest are informational or transac- tional in nature. This is because users now pursue much broader informational and task-oriented goals such as arranging for future travel, managing their finances, or planning their purchase decisions. However, the primary means of accessing information online is still through keyword queries to a search engine. A complex task such as travel arrangement has to be broken down into a number of co-dependent steps over a period of time. For instance, a user may first search on possible destinations, timeline, events, etc.

Query (or Query Group) Relevance.

To ensure that each query group contains closely related and relevant queries and clicks, it is important to have a suitable rele- vance measure sim between the current query singleton
group sc and an existing query group si ∈ S. There
are a number of possible approaches to determine the
relevance between sc and si . Below, we outline a number
of different relevance metrics that we will later use as
baselines in experiments (see Section 5). We will also
discuss the pros and cons of such metrics as well as our
proposed approach of using search logs (see Section 3).
Time. One may assume that sc and si are somehow relevant if the queries appear close to each other in time in the user ’s history. In other words, we assume that users generally issue very similar queries and clicks within a short period of time. In this case, we define the following time-based relevance metric simtime that can be used in place of sim in Figure 3.

QUERY RELEVANCE USING SEARCH LOGS

We now develop the machinery to define the query relevance based on Web search logs. Our measure of relevance is aimed at capturing two important properties of relevant queries, namely: (1) queries that frequently appear together as reformulations and (2) queries that have induced the users to click on similar sets of pages. We start our discussion by introducing three search behavior graphs that capture the aforementioned prop- erties. Following that, we show how we can use these graphs to compute query relevance and how we can incorporate the clicks following a user ’s query in order to enhance our relevance metric.

Search Behavior Graphs

We derive three types of graphs from the search logs of a commercial search engine. The query reformulation
graph, QRG, represents the relationship between a pair
of queries that are likely reformulations of each other. The query click graph, QCG, represents the relationship
between two queries that frequently lead to clicks on similar URLs. The query fusion graph, QF G, merges the
information in the previous two graphs. All three graphs are defined over the same set of vertices VQ, consisting
of queries which appear in at least one of the graphs, but their edges are defined differently.

Query Click Graph

A different way to capture relevant queries from the search logs is to consider queries that are likely to induce users to click frequently on the same set of URLs. For example, although the queries “ipod” and “apple store” do not share any text or appear temporally close in a user ’s search history, they are relevant because they are likely to have resulted in clicks about the ipod product. In order to capture such property of relevant queries, we
construct a graph called the query click graph, QCG.
One way to identify relevant queries is to consider query reformulations that are typically found within the query logs of a search engine. If two queries that are issued consecutively by many users occur frequently enough, they are likely to be reformulations of each other. To measure the relevance between two queries issued by a user, the time-based metric, simtime, makes use of the interval between the timestamps of the queries within the user ’s search history. In contrast, our approach is defined by the statistical frequency with which two queries appear next to each other in the entire query log, over all of the users of the system.

QUERY GROUPING USING THE QF G

In this section, we outline our proposed similarity func- tion simrel to be used in the online query grouping process outlined in Section 2. For each query, we main- tain a query image, which represents the relevance of other queries to this query. For each query group, we maintain a context vector, which aggregates the images of its member queries to form an overall representation. We then propose a similarity function simrel for two query groups based on these concepts of context vectors and query images. Note that our proposed definitions of query reformulation graph, query images, and context vectors are crucial ingredients, which lend significant novelty to the Markov chain process for determining relevance between queries and query groups.

Using Search Logs

As discussed in Section 3, our query grouping algorithm relies heavily on the use of search logs in two ways: first, to construct the query fusion graph used in com- puting query relevance, and, second, to expand the set of queries considered when computing query relevance. We start our experimental evaluation, by investigating how we can make the most out of the search logs.

**jaseela123** · 02-09-2017, 04:35 PM

Users are performing increasingly complex goal-oriented tasks on the web, such as making travel arrangements, managing finances, or planning purchases. To this end, they often divide the tasks into a few codependent steps and multiple queries around these steps over and over for long periods of time. To better support users in their long-term information missions on the web, search engines track their queries and clicks while searching online. In this work, we study the problem of organizing the historical queries of a user in groups in a dynamic and automated way. Automatic query group identification is useful for a number of different search engine components and applications, such as query suggestions, results sorting, query modifications, sessions, and collaborative search. In our approach, we move beyond approaches that rely on textual similarity or time thresholds, and propose a more robust approach that leverages search query logs. We experimentally study the performance of different techniques, and show their potential, especially when combined.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Biometrics Security System Full Download Seminar Report and Paper Presentation	computer science crazy	30	190,561,110	24-02-2021, 08:13 AM Last Post: buy cialis generic
	Ultrasonic Trapping In Capillaries For Trace-Amount Bi (Download Full Seminar Report)	Computer Science Clay	2	104,277,107	17-01-2018, 11:59 AM Last Post: dhanabhagya
	nanorobotics full report	project topics	24	176,551,278	16-01-2018, 05:50 PM Last Post: Guest
	robotic surgery full report	project report tiger	16	150,961,205	07-01-2018, 07:28 PM Last Post: Raymondnof
	Human Computer Interface : Seminar Report and PPT	seminar post	1	1,337	22-09-2017, 11:23 AM Last Post: jaseela123
	4G Broadband : Seminar Report and PPT	study tips	1	1,261	22-09-2017, 11:19 AM Last Post: jaseela123
	Amoeba full report	project topics	1	1,631,984	22-09-2017, 10:38 AM Last Post: jaseela123
	Itanium Processor : Seminar Report and PPT	seminar projects maker	1	1,052	21-09-2017, 12:46 PM Last Post: jaseela123
	Design and Analysis Of Algorithms : Seminar Report and PPT	seminar projects maker	1	1,315	21-09-2017, 12:04 PM Last Post: jaseela123
	Ranked, Efficient and Secure Keyword search over encrypted cloud data PPT	seminar post	1	814	21-09-2017, 11:55 AM Last Post: jaseela123

Quick Reply
Message Type your reply to this message here. Disable Smilies	You have selected one or more posts to quote. Quote these posts now or deselect them.