Managing Metadata and Provenance for Elementary-Particle Physics

Overview

This project is a collaboration between faculty and researchers at Cornell's Wilson Laboratory and the Computer Science department.

The particle detector in Cornell's Wilson Laboratory plays a significant role in current research on the structure of matter. The laboratory is one of ten major accelerator facilities in the world today. The subatomic particles, which are produced by the collision of electrons and positrons, are studied by the multi-institutional collaboration that runs CLEO (the actual detector) and conducts research in elementary-particle physics, the study of the basic building blocks of matter. Please refer to the respective home pages for more information on CLEO and the Laboratory for Elementary-Particle Physics.

CLEO produces large amounts of data, which are analyzed by physicists all over the world. The data consist of raw data about particle collisions and additional information about the detector calibration at the time the collisions were recorded. A fairly complex workflow, called reconstruction, is used to clean the raw data and to refine the calibration information. The first goal of our project is to provide a database infrastructure for managing the metadata that are generated at different stages of the reconstruction process. This will simplify both the analysis process for end users and the reconstruction process itself.
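To make the idea of per-stage metadata concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is only an illustration under stated assumptions, not the project's actual schema: the table name stage_metadata, its columns, the stage names, and the sample values are all hypothetical.

```python
import sqlite3


def open_metadata_store():
    """Create an in-memory store. Table and column names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE stage_metadata (
            run_id INTEGER NOT NULL,  -- data-taking run the entry belongs to
            stage  TEXT    NOT NULL,  -- workflow stage that produced the entry
            key    TEXT    NOT NULL,  -- name of the metadata item
            value  TEXT    NOT NULL,  -- value, stored as text for simplicity
            PRIMARY KEY (run_id, stage, key)
        )
        """
    )
    return conn


def record(conn, run_id, stage, key, value):
    """Store one metadata item produced by one stage for one run."""
    conn.execute(
        "INSERT INTO stage_metadata VALUES (?, ?, ?, ?)",
        (run_id, stage, key, str(value)),
    )


def metadata_for_run(conn, run_id):
    """Return all (stage, key, value) rows for a run, sorted by stage and key."""
    cursor = conn.execute(
        "SELECT stage, key, value FROM stage_metadata "
        "WHERE run_id = ? ORDER BY stage, key",
        (run_id,),
    )
    return cursor.fetchall()


# Illustrative values only; they do not come from CLEO.
conn = open_metadata_store()
record(conn, 1001, "raw", "event_count", 52000)
record(conn, 1001, "reconstruction", "software_version", "v2")
rows = metadata_for_run(conn, 1001)
```

Keeping the stage as an explicit column means that both end users and the reconstruction workflow can query the same store, one by run and the other by stage.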

The second goal of our project is to add support for provenance to the existing system. The final results of a CLEO particle event analysis depend not only on the actual measurements related to a particle collision event, but also on numerous other factors. The two major factors are the detector calibration constants and the data processing software. Both can change over time as new insights are gained from previous analyses (e.g., corrections to the assumed positions of measuring wires in the detector, or a new version of the track reconstruction software). Such changes in data and software can affect the validity of previous results and can limit the reuse of previously derived data in future analyses. In particular, users of the CLEO data are interested in the following questions:
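The dependency of a result on calibration constants and software can be sketched in code. The following Python fragment is a hypothetical illustration, not the project's design: the class names Provenance and DerivedResult, the function changed_inputs, and the version labels are all assumptions made for this example. It records the two major factors with each derived result and reports which of them have changed since.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Inputs other than the measurements themselves (hypothetical fields)."""
    calibration_version: str
    software_version: str


@dataclass(frozen=True)
class DerivedResult:
    """A derived value together with the provenance it was computed under."""
    event_id: int
    value: float
    provenance: Provenance


def changed_inputs(result, current):
    """List which major factors differ between a result and the current setup."""
    changed = []
    if result.provenance.calibration_version != current.calibration_version:
        changed.append("calibration")
    if result.provenance.software_version != current.software_version:
        changed.append("software")
    return changed


# Illustrative version labels only.
old_setup = Provenance("calib-1", "track-1")
new_setup = Provenance("calib-2", "track-1")
result = DerivedResult(event_id=7, value=1.5, provenance=old_setup)
stale_because = changed_inputs(result, new_setup)
```

An empty list from changed_inputs suggests the result may be reused as is; a non-empty list names the factors that would need to be re-examined before reuse.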

Research Foci

People

Manuel Calimlim
Johannes Gehrke
Lawrence Gibbons
Chris Jones
Valentin Kuznetsov
Mirek Riedewald
Dan Riley
Anders Ryd
Gregory J. Sharp

Internal

Software

Detailed description, summary of current status, schemata, etc.