Project Central
PREDATOR is meant to be used in advanced classes as an educational tool.
The design document and implementation
notes make it possible for students to extend the system
in incremental ways. Here are some suggested projects.
Query Processing
Modify reladt/compile, reladt/optimize and
reladt/execute.
- New Join Algorithms:
PREDATOR currently has sort-merge, tuple-nested-loops
and page-nested-loops join algorithms. Add a hash-join algorithm.
Also, improve the page-nested-loops algorithm to use an
in-memory index. Compare the performance of the different
algorithms.
- Temporal Join Algorithms:
Modify the sort-merge join algorithm to implement various
"temporal" joins.
- Aggregates:
Implement hash-based aggregates and compare them with the
sort-based aggregates already implemented.
- OLAP:
Implement a CUBE operator using sorting or hashing.
- Sorting:
Implement a fast sort that can be integrated with sort-merge
joins for increased performance. This involves bypassing the
SHORE sort routine.
- Views:
Implement materialized views and function caching
(as a special case of materialized views).
- Sampling:
Implement sampling versions of all (or some subset of)
the operators.
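As a starting point for the hash-join project suggested above, here is a minimal in-memory sketch of the classic build/probe approach. It is not PREDATOR code: the function name hash_join, the dict-based row format and the column names are assumptions made for illustration, and a real implementation inside reladt/execute would also need to partition inputs that do not fit in memory.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows (hypothetical row format)."""
    # Build phase: hash the (ideally smaller) input on its join column.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row and emit one merged row per match.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            result.append({**match, **row})
    return result


depts = [{"dept_id": 1, "dept": "db"}, {"dept_id": 2, "dept": "os"}]
emps = [
    {"name": "ann", "dept_id": 1},
    {"name": "bob", "dept_id": 3},
    {"name": "cy", "dept_id": 1},
]
joined = hash_join(depts, emps, "dept_id", "dept_id")
```

The same build-then-probe structure carries over to hash-based aggregates: replace the list of matching rows per key with a running aggregate per group.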
Query Optimization
- KBZ Join Ordering:
Add a KBZ join optimizer for join queries.
- Randomized Join Ordering:
Add a variety of randomized join optimizers.
- Enhanced Statistics:
Implement advanced statistics collection (like
serial histograms) and modify selectivity estimation
to use them.
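To make the statistics project more concrete, the following sketch estimates the selectivity of a range predicate from a histogram. It uses a plain equi-width histogram, not the serial histograms mentioned above, as a baseline to compare against. The class and method names are hypothetical and do not correspond to anything in the PREDATOR codebase.

```python
class EquiWidthHistogram:
    """Equi-width histogram over numeric values in the range [lo, hi]."""

    def __init__(self, values, lo, hi, buckets):
        self.lo, self.hi, self.buckets = lo, hi, buckets
        self.width = (hi - lo) / buckets
        self.counts = [0] * buckets
        self.total = 0
        for v in values:
            self.counts[self._bucket(v)] += 1
            self.total += 1

    def _bucket(self, v):
        # Clamp the index so out-of-range values and v == hi land in an end bucket.
        idx = int((v - self.lo) / self.width)
        return min(max(idx, 0), self.buckets - 1)

    def selectivity_less_than(self, x):
        """Estimate the fraction of values < x, assuming values are
        uniformly distributed within each bucket."""
        if self.total == 0 or x <= self.lo:
            return 0.0
        if x >= self.hi:
            return 1.0
        b = self._bucket(x)
        full = sum(self.counts[:b])  # buckets entirely below x
        frac = (x - (self.lo + b * self.width)) / self.width
        return (full + frac * self.counts[b]) / self.total


hist = EquiWidthHistogram(range(100), lo=0, hi=100, buckets=10)
estimate = hist.selectivity_less_than(25)
```

An optimizer would multiply such an estimate by the relation's cardinality to predict the output size of a selection.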
New Data Types
- Build a new E-ADT:
Take your pick of several possible data types.
- Feature extraction:
PREDATOR already has some code for image feature extraction --
however, while all the mechanism is in place, the actual
feature extraction code is not in the released codebase
(it belongs to a different project). Use your own feature
extraction code instead. Or try this with audio data.
- Multi-resolution storage:
An initial implementation of multi-resolution storage is
provided for image data. Implement a more generic version
that can be applied uniformly across data types.
- Compressed/Encrypted storage:
Build mechanisms that can be used to transparently store
data in compressed or encrypted form.
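The idea of transparent compressed storage from the list above can be illustrated with a small wrapper that compresses on write and decompresses on read, so callers only ever see the original bytes. This is a sketch under stated assumptions: CompressedStore is a hypothetical in-memory stand-in for a storage layer, and zlib is chosen only because it ships with Python.

```python
import zlib


class CompressedStore:
    """Hypothetical key-value store that compresses values transparently."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data: bytes):
        # Callers pass plain bytes; only the compressed form is kept.
        self._blobs[key] = zlib.compress(data)

    def get(self, key) -> bytes:
        # Callers get back exactly the bytes they stored.
        return zlib.decompress(self._blobs[key])

    def stored_size(self, key) -> int:
        # Size of the compressed representation, for measuring savings.
        return len(self._blobs[key])


store = CompressedStore()
store.put("img", b"abc" * 1000)
```

An encrypted variant would follow the same shape, with the compress and decompress calls replaced by encryption and decryption.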
Indexing
- Feature Indexing:
Explore options for indexing high-dimensional feature data.
- Generalized Indexing:
Incorporate GiST trees as another indexing alternative.
- Specialized indexes:
Build a specialized index for a particular data type
(for example, a signature file for documents).
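For the signature-file example mentioned above, a minimal sketch using superimposed coding might look like this. Each word sets a few bits in a fixed-width signature, a document's signature is the OR of its word signatures, and a query first filters on signatures and then checks the text to discard false positives. The signature width, the number of bits per word, the use of CRC32 as the hash and all names here are assumptions for illustration.

```python
import zlib

SIG_BITS = 64       # assumed signature width
BITS_PER_WORD = 3   # assumed number of bits set per word


def word_signature(word):
    """Set up to BITS_PER_WORD bits chosen by hashing the lowercased word."""
    sig = 0
    for i in range(BITS_PER_WORD):
        h = zlib.crc32(f"{i}:{word.lower()}".encode())
        sig |= 1 << (h % SIG_BITS)
    return sig


def doc_signature(text):
    """Superimpose (OR together) the signatures of all words in the text."""
    sig = 0
    for w in text.split():
        sig |= word_signature(w)
    return sig


class SignatureFile:
    def __init__(self):
        self.docs = []  # list of (signature, text); position is the doc id

    def add(self, text):
        self.docs.append((doc_signature(text), text))
        return len(self.docs) - 1

    def search(self, word):
        q = word_signature(word)
        hits = []
        for doc_id, (sig, text) in enumerate(self.docs):
            # Signature test: cheap filter that can yield false positives.
            if sig & q == q:
                # Verification step: confirm against the actual text.
                if word.lower() in (w.lower() for w in text.split()):
                    hits.append(doc_id)
        return hits


sf = SignatureFile()
sf.add("hash join algorithms")
sf.add("image feature extraction")
```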
Data Mining
- Standard Mining Algorithms:
Implement a set of data mining algorithms as part
of a separate relational query language (in addition
to SQL). PREDATOR allows multiple query languages, so
simply add a data mining language.
- Integrated Mining:
Add a specific mining algorithm within the SQL engine
and explore possibilities for query optimization.
Data Warehousing
- Trigger Mechanism:
Provide a basic trigger mechanism which can specify actions
to be performed on updates.
- View Maintenance:
Implement view maintenance algorithms either on top of the
trigger mechanism, or standalone.
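The two projects above fit together naturally, as the following sketch suggests: a table fires registered actions on each insert, and a materialized view registers itself as such an action to keep a grouped sum up to date incrementally. The Table and SumByKeyView classes are hypothetical, and the sketch handles inserts only; deletes and updates would need their own trigger events and compensating logic.

```python
class Table:
    """Toy table with insert triggers (hypothetical interface)."""

    def __init__(self):
        self.rows = []
        self._triggers = []

    def on_insert(self, action):
        # Register a callable to run after every insert.
        self._triggers.append(action)

    def insert(self, row):
        self.rows.append(row)
        for action in self._triggers:
            action(row)


class SumByKeyView:
    """Materialized view for: SELECT key, SUM(value) ... GROUP BY key."""

    def __init__(self, table, key, value):
        self.key, self.value = key, value
        self.totals = {}
        # Initial materialization from rows already in the table.
        for row in table.rows:
            self._apply(row)
        # Incremental maintenance: apply each new row as it arrives.
        table.on_insert(self._apply)

    def _apply(self, row):
        k = row[self.key]
        self.totals[k] = self.totals.get(k, 0) + row[self.value]


sales = Table()
sales.insert({"region": "east", "amount": 10})
view = SumByKeyView(sales, "region", "amount")
sales.insert({"region": "east", "amount": 5})
sales.insert({"region": "west", "amount": 7})
```

The view never rescans the table after its initial load, which is the point of incremental maintenance.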
Web-based Data
- Semi-Structured Data:
Add support for "semi-structured" data using new data
types.
- Web Search Engine:
Build your own. Extend indexing to support inverted files
on documents.
- Streaming Protocols:
Integrate streaming protocols like RealAudio/RealVideo
for multi-media playback.
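For the search-engine project above, an inverted file maps each term to the set of documents that contain it. The sketch below is a minimal in-memory version with conjunctive (AND) queries. The InvertedIndex class, whitespace tokenization and integer document ids are all assumptions made for illustration.

```python
from collections import defaultdict


class InvertedIndex:
    def __init__(self):
        # term -> set of ids of documents containing that term
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        result = None
        for t in terms:
            ids = self.postings.get(t, set())
            # Intersect posting sets; the first term seeds the result.
            result = ids if result is None else result & ids
        return sorted(result)


index = InvertedIndex()
index.add(1, "PREDATOR query processing")
index.add(2, "query optimization in PREDATOR")
index.add(3, "image feature extraction")
```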
Client Extensions
Extend the Java GUI with all the standard client-side
extensions.
- Client-Side Caching:
How much caching can you perform? What limits do browsers
place on your memory size?
- Client-Specific Delivery:
Factor client-side characteristics (like connection
speed and browser capabilities) directly into server-side
query processing (for example, deliver lower-resolution
data).
- HTTP Server:
Build an HTTP server on top of PREDATOR. HTTP is simply
another language supported by a separate query processing
engine.
Applications
Roll your own application, complete with transactions,
queries, complex content, web-based clients and
a Java GUI. Here are some sample apps we are building:
- Conference Management Software:
Possibly even to handle submissions to SIGMOD 98.
- Digital Library:
Images of rare art from national art galleries.
- GIS:
Geographic data from the Sequoia benchmark.
Mail user support: predator-support@cs.cornell.edu