CS 789 THEORY SEMINAR
Speaker: Carmel Domshlak
Affiliation: Cornell University
Date: February 23, 2004
Title: Schema Meta-matching
Abstract:
Schema matching, the process of
matching between the concepts describing the meaning of data in heterogeneous,
distributed data sources (e.g., database schemata, XML DTDs, HTML form tags),
is one of the basic operations required by the process of data
integration. Recently, several tools for automatic schema matching have been
proposed and evaluated in the database community. While in many domains these
tools succeed in finding the right matching between the concepts, empirical
analysis shows that there is no single algorithm that is guaranteed to succeed
in all possible domains and applications.
In this paper we introduce schema
meta-matching, a novel framework for composing an arbitrary ensemble of
algorithms for schema matching. Informally, schema meta-matching is about
computing a "consensus" ranking of alternative mappings between two
sets of concepts, given the "individual" graded rankings
provided by several algorithms for schema matching.
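To make the informal description concrete, here is a minimal sketch of one generic way to aggregate graded rankings: averaging each candidate mapping's grades across matchers and sorting by the result. This is not the method presented in the talk. The function name consensus_ranking, the labels "m1" to "m3", the grade values, and the choice to treat an ungraded mapping as 0.0 are all assumptions made for illustration.

```python
from statistics import mean


def consensus_ranking(graded_rankings):
    """Rank candidate mappings by their mean grade across matchers.

    graded_rankings: list of dicts, one per matching algorithm, each
    mapping a candidate mapping (any hashable label) to a numeric grade.
    Assumption: a mapping a matcher did not grade counts as 0.0.
    """
    # Collect every candidate mapping mentioned by any matcher.
    candidates = set()
    for grades in graded_rankings:
        candidates.update(grades)

    # Average the grades each candidate received.
    scored = {
        c: mean(grades.get(c, 0.0) for grades in graded_rankings)
        for c in candidates
    }

    # Highest mean first; ties broken by label so the order is deterministic.
    return sorted(scored.items(), key=lambda kv: (-kv[1], repr(kv[0])))


# Hypothetical grades from two matchers over three candidate mappings.
matcher_a = {"m1": 0.9, "m2": 0.6, "m3": 0.2}
matcher_b = {"m1": 0.7, "m2": 0.8, "m3": 0.1}

ranking = consensus_ranking([matcher_a, matcher_b])
print([label for label, _ in ranking])
```

Score averaging is only one of the standard quantitative aggregation schemes; it is used here because it is the simplest to state.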
We begin this talk by providing
a high-level view of the area of schema matching. We formalize the problem of
schema meta-matching, and introduce several algorithmic solutions for this
problem. These algorithms range from adaptations of some standard techniques
for general quantitative rank aggregation, to novel techniques specific to the
problem of schema matching, and to combinations of both. Throughout the talk,
we provide a comparative discussion of the alternative algorithms. Finally, we
introduce several directions for future work.
Joint work with Avigdor Gal (Technion, Israel)