Learning to characterize matching experts

Roee Shraga, Ofra Amir, Avigdor Gal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Matching is a task at the heart of any data integration process, aimed at identifying correspondences among data elements. Matching problems were traditionally solved in a semi-automatic manner, with correspondences generated by matching algorithms and outcomes subsequently validated by human experts. Human-in-the-loop data integration has recently been challenged by the introduction of big data, and recent studies have analyzed obstacles to effective human matching and validation. In this work we characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We provide a novel framework for characterizing matching experts that, accompanied by a novel set of features, can be used to identify reliable and valuable human experts. We demonstrate the usefulness of our approach using an extensive empirical evaluation. In particular, we show that our approach can improve matching results by filtering out inexpert matchers.

Original language: English
Title of host publication: Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
Pages: 1236-1247
Number of pages: 12
ISBN (Electronic): 9781728191843
State: Published - Apr 2021
Event: 37th IEEE International Conference on Data Engineering, ICDE 2021 - Virtual, Chania, Greece
Duration: 19 Apr 2021 to 22 Apr 2021

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2021-April
ISSN (Print): 1084-4627

Conference

Conference: 37th IEEE International Conference on Data Engineering, ICDE 2021
Country/Territory: Greece
City: Virtual, Chania
Period: 19/04/21 to 22/04/21

Keywords

  • Data Integration
  • Deep Learning
  • Human in the loop
  • Schema Matching

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
