TY - GEN
T1 - Content-based relevance estimation on the web using inter-document similarities
AU - Raiber, Fiana
AU - Kurland, Oren
AU - Tennenholtz, Moshe
PY - 2012
Y1 - 2012
N2 - In adversarial and noisy search settings as the Web, the document-query surface level similarity can be a highly misleading relevance signal. Thus, devising content-based relevance estimation (ranking) approaches becomes highly challenging. We address this challenge using two methods that utilize inter-document similarities in an initially retrieved list. The first removes documents from the list that exhibit high query similarity, but for which there is insufficient additional support for relevance that is based on inter-document similarities. The method is based on a probabilistic model that decouples document-query similarities from relevance estimation. The second method re-ranks the list by "rewarding" documents that exhibit high similarity both to the query and to other documents in the list. Both methods incorporate, in addition, at the model level, query-independent document quality estimates. Extensive empirical evaluation demonstrates the merits of our methods.
AB - In adversarial and noisy search settings as the Web, the document-query surface level similarity can be a highly misleading relevance signal. Thus, devising content-based relevance estimation (ranking) approaches becomes highly challenging. We address this challenge using two methods that utilize inter-document similarities in an initially retrieved list. The first removes documents from the list that exhibit high query similarity, but for which there is insufficient additional support for relevance that is based on inter-document similarities. The method is based on a probabilistic model that decouples document-query similarities from relevance estimation. The second method re-ranks the list by "rewarding" documents that exhibit high similarity both to the query and to other documents in the list. Both methods incorporate, in addition, at the model level, query-independent document quality estimates. Extensive empirical evaluation demonstrates the merits of our methods.
KW - inter-document similarities
KW - web search
UR - http://www.scopus.com/inward/record.url?scp=84871042710&partnerID=8YFLogxK
U2 - 10.1145/2396761.2398514
DO - 10.1145/2396761.2398514
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:84871042710
SN - 9781450311564
T3 - ACM International Conference Proceeding Series
SP - 1769
EP - 1773
BT - CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
T2 - 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
Y2 - 29 October 2012 through 2 November 2012
ER -