TY - JOUR
T1 - Do hard topics exist? A statistical analysis
AU - Culpepper, J. Shane
AU - Faggioli, Guglielmo
AU - Ferro, Nicola
AU - Kurland, Oren
N1 - Publisher Copyright:
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2021
Y1 - 2021
N2 - Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. Recently there has been renewed interest in the notion of “query variations” which are essentially multiple user formulations for an information need. Like many retrieval models, some queries are highly effective while others are not. In this work1, we explore the fundamental problem of studying the interaction components of an IR experimental collection. Our findings show that query formulations have a comparable effect size to the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. This suggests that topic difficulty is an artifact of the collection considered and highlights the importance of further research in understanding link between the complexity of a topic and the query rewriting in IR related tasks.
AB - Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. Recently there has been renewed interest in the notion of “query variations” which are essentially multiple user formulations for an information need. Like many retrieval models, some queries are highly effective while others are not. In this work1, we explore the fundamental problem of studying the interaction components of an IR experimental collection. Our findings show that query formulations have a comparable effect size to the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. This suggests that topic difficulty is an artifact of the collection considered and highlights the importance of further research in understanding link between the complexity of a topic and the query rewriting in IR related tasks.
UR - http://www.scopus.com/inward/record.url?scp=85115657805&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.conferencearticle???
AN - SCOPUS:85115657805
SN - 1613-0073
VL - 2947
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 11th Italian Information Retrieval Workshop, IIR 2021
Y2 - 13 September 2021 through 15 September 2021
ER -