CS 789 THEORY SEMINAR


Speaker:      Carmel Domshlak
Affiliation:  Cornell University
Date:         February 23, 2004
Title:        Schema Meta-matching

Abstract:

Schema matching, the process of matching the concepts that describe the meaning of data in heterogeneous, distributed data sources (e.g., database schemata, XML DTDs, HTML form tags), is one of the basic operations required for data integration. Recently, several tools for automatic schema matching have been proposed and evaluated in the database community. While in many domains these tools succeed in finding the right matching between the concepts, empirical analysis shows that no single algorithm is guaranteed to succeed in all domains and applications.
 
In this paper we introduce schema meta-matching, a novel framework for composing an arbitrary ensemble of algorithms for schema matching. Informally, schema meta-matching is about computing a "consensus" ranking of alternative mappings between two sets of concepts, given the "individual" graded rankings provided by several algorithms for schema matching.
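
To make the informal description concrete, here is a minimal sketch of one way a "consensus" ranking could be computed from individual graded rankings. It is an illustration only, not the method presented in the talk: it assumes each matcher assigns a score in [0, 1] to each candidate mapping and simply ranks mappings by their mean score. The function name consensus_ranking, the sample attribute names, and the scores are all hypothetical.

```python
from collections import defaultdict


def consensus_ranking(matcher_scores, top_k=None):
    """Rank candidate mappings by their mean score across matchers.

    matcher_scores: list of dicts, one per matcher, each mapping a
    candidate mapping (any hashable label) to a score in [0, 1].
    A mapping that a matcher did not score counts as 0 for that matcher.
    Mean-score aggregation is an assumption made for illustration.
    """
    totals = defaultdict(float)
    for scores in matcher_scores:
        for mapping, score in scores.items():
            totals[mapping] += score

    n = len(matcher_scores)
    # Sort by descending mean score; break ties by label for determinism.
    ranked = sorted(
        ((mapping, total / n) for mapping, total in totals.items()),
        key=lambda item: (-item[1], str(item[0])),
    )
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical candidate mappings between two small schemata,
# each written as a tuple of (source attribute, target attribute) pairs.
m1 = (("name", "fullName"), ("addr", "address"))
m2 = (("name", "address"), ("addr", "fullName"))
m3 = (("name", "fullName"),)

# Hypothetical graded rankings from two schema matchers.
matcher_a = {m1: 0.9, m2: 0.2, m3: 0.6}
matcher_b = {m1: 0.7, m2: 0.4, m3: 0.8}

ranked = consensus_ranking([matcher_a, matcher_b])
for mapping, mean_score in ranked:
    print(round(mean_score, 2), mapping)
```

In this toy input the two matchers disagree on whether m1 or m3 is best; averaging places m1 first, m3 second, and m2 last. Other aggregation rules (for example, taking the minimum or a weighted sum) would fit the same interface.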
 
We begin this talk by providing a high-level view of the area of schema matching. We then formalize the problem of schema meta-matching and introduce several algorithmic solutions for it. These algorithms range from adaptations of standard techniques for general quantitative rank aggregation, to novel techniques specific to the problem of schema matching, to combinations of both. Throughout the talk, we provide a comparative discussion of the alternative algorithms. Finally, we outline several directions for future work.
 
Joint work with Avigdor Gal (Technion, Israel)