Description:
Search sessions consist of a person presenting a query to a search engine, followed by that person examining the search
results, selecting some of those search results for further review, possibly following some series of hyperlinks, and perhaps
backtracking to previously viewed pages in the session. The series of pages selected for viewing in a search session, sometimes
called the click data, is intuitively a source of relevance feedback information to the search engine. We are interested
in how that relevance feedback can be used to improve the quality of search results for all users, not just the current user. For
example, the search engine could learn which documents are frequently visited when certain search queries are given.
In this article, we address three issues related to using click data as implicit relevance feedback: (1) how click data
beyond the search results page might be more reliable than just the clicks from the search results page; (2) whether we
can further subselect from this click data to get even more reliable relevance feedback; and (3) how the reliability of click
data for relevance feedback changes when the goal becomes finding one document that completely meets the user's
information needs (if possible). We refer to such documents as strictly relevant to the query.
Our conclusions are based on empirical data from a live website with manual assessment of relevance. We found that
considering all of the click data in a search session as relevance feedback has the potential to increase both precision and
recall of the feedback data. We further found that, when the goal is identifying strictly relevant documents, it could be
useful to focus on last visited documents rather than all documents visited in a search session.
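
As an illustration only, the two kinds of feedback discussed above (all pages visited in a session versus only the last visited page) might be aggregated and scored against manual relevance judgments roughly as follows. This is a minimal sketch, not the procedure used in the study: the session format, the toy data, and the names `SESSIONS`, `click_feedback`, and `precision_recall` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy sessions: (query, pages viewed in order).
# This is made-up example data, not data from the study.
SESSIONS = [
    ("reset password", ["/help", "/help/account", "/help/password"]),
    ("reset password", ["/help/password"]),
    ("reset password", ["/news", "/help/password"]),
    ("reset password", ["/help", "/contact"]),
]


def click_feedback(sessions, last_only=False):
    """Count, per query, the number of sessions in which each page was visited.

    With last_only=True, only the final page of each session is counted.
    """
    counts = defaultdict(Counter)
    for query, pages in sessions:
        if not pages:
            continue  # a session with no page views gives no feedback
        selected = pages[-1:] if last_only else pages
        # set() so a page revisited by backtracking counts once per session
        counts[query].update(set(selected))
    return counts


def precision_recall(predicted, relevant):
    """Precision and recall of predicted pages against manually judged relevant pages."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    all_clicks = click_feedback(SESSIONS)["reset password"]
    last_clicks = click_feedback(SESSIONS, last_only=True)["reset password"]
    # Hypothetical manual judgment: one strictly relevant page for this query.
    strictly_relevant = {"/help/password"}
    print("all visited: ", precision_recall(all_clicks.keys(), strictly_relevant))
    print("last visited:", precision_recall(last_clicks.keys(), strictly_relevant))
```

The per-page counts could also be thresholded before scoring, so that only pages visited in several sessions count as feedback; the sketch leaves that choice open.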