TY - GEN
T1 - Learning to characterize matching experts
AU - Shraga, Roee
AU - Amir, Ofra
AU - Gal, Avigdor
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/4
Y1 - 2021/4
N2 - Matching is a task at the heart of any data integration process, aimed at identifying correspondences among data elements. Matching problems were traditionally solved in a semi-automatic manner, with correspondences being generated by matching algorithms and outcomes subsequently validated by human experts. Human-in-the-loop data integration has been recently challenged by the introduction of big data and recent studies have analyzed obstacles to effective human matching and validation. In this work we characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We provide a novel framework for characterizing matching experts that, accompanied with a novel set of features, can be used to identify reliable and valuable human experts. We demonstrate the usefulness of our approach using an extensive empirical evaluation. In particular, we show that our approach can improve matching results by filtering out inexpert matchers.
AB - Matching is a task at the heart of any data integration process, aimed at identifying correspondences among data elements. Matching problems were traditionally solved in a semi-automatic manner, with correspondences being generated by matching algorithms and outcomes subsequently validated by human experts. Human-in-the-loop data integration has been recently challenged by the introduction of big data and recent studies have analyzed obstacles to effective human matching and validation. In this work we characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We provide a novel framework for characterizing matching experts that, accompanied with a novel set of features, can be used to identify reliable and valuable human experts. We demonstrate the usefulness of our approach using an extensive empirical evaluation. In particular, we show that our approach can improve matching results by filtering out inexpert matchers.
KW - Data Integration
KW - Deep Learning
KW - Human in the loop
KW - Schema Matching
UR - http://www.scopus.com/inward/record.url?scp=85112865715&partnerID=8YFLogxK
U2 - 10.1109/ICDE51399.2021.00111
DO - 10.1109/ICDE51399.2021.00111
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85112865715
T3 - Proceedings - International Conference on Data Engineering
SP - 1236
EP - 1247
BT - Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
T2 - 37th IEEE International Conference on Data Engineering, ICDE 2021
Y2 - 19 April 2021 through 22 April 2021
ER -