Abstract
Document retrieval methods that utilize relevance feedback often induce a single query model from the set of feedback documents, specifically, the relevant documents. We empirically show that for a few state-of-the-art query-model induction methods, retrieval performance can be significantly improved by constructing the query model from a subset of the relevant documents rather than from all of them. Motivated by this finding, we propose a new approach for relevance-feedback-based retrieval. The approach, derived from the risk minimization framework, is based on utilizing multiple query models induced from all subsets of the given relevant documents. Empirical evaluation shows that the approach posts performance that is statistically significantly better than that of the standard practice of utilizing a single query model induced from the relevant documents. While the average relative improvements are small to moderate, the robustness of the approach is substantially higher than that of a variety of reference comparison methods that address various challenges in using relevance feedback.
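The idea of using query models induced from all subsets of the relevant documents can be illustrated with a minimal Python sketch. It is not the method of the article: the abstract does not specify the induction method, the scoring function, or how the models are combined, so the sketch assumes averaged maximum-likelihood unigram models, additively smoothed cross-entropy scoring, and a uniform average over subsets. All function names and the `mu` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def unigram_model(text):
    """Maximum-likelihood unigram language model of one text."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def induce_query_model(docs):
    """Assumed induction step: average the unigram models of the documents."""
    model = Counter()
    for doc in docs:
        for term, p in unigram_model(doc).items():
            model[term] += p / len(docs)
    return dict(model)


def nonempty_subsets(items):
    """Yield every non-empty subset of items (2^n - 1 subsets for n items)."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def score(query_model, doc, vocab_size, mu=0.5):
    """Negative cross entropy of an additively smoothed document model (assumed scorer)."""
    counts = Counter(doc.lower().split())
    length = sum(counts.values())
    return sum(
        p * math.log((counts[term] + mu) / (length + mu * vocab_size))
        for term, p in query_model.items()
    )


def rank_with_subset_models(relevant_docs, corpus):
    """Rank corpus documents by their average score over all subset-induced query models."""
    vocab = {t for d in list(corpus) + list(relevant_docs) for t in d.lower().split()}
    # One query model per non-empty subset of the relevant documents.
    models = [induce_query_model(s) for s in nonempty_subsets(relevant_docs)]
    scored = [
        (sum(score(m, d, len(vocab)) for m in models) / len(models), d)
        for d in corpus
    ]
    return [d for _, d in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

Because the number of non-empty subsets is 2^n - 1 for n relevant documents, a direct enumeration like this one is only practical when the feedback set is small.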
Original language | English |
---|---|
Article number | 44 |
Journal | ACM Transactions on Information Systems |
Volume | 37 |
Issue number | 4 |
State | Published - Oct 2019 |
Keywords
- Ad hoc retrieval
- Relevance feedback
ASJC Scopus subject areas
- Information Systems
- General Business, Management and Accounting
- Computer Science Applications