Project Central

PREDATOR is meant to be used in advanced classes as an educational tool. The design document and implementation notes make it possible for students to extend the system in incremental ways. Here are some suggested projects.

Query Processing

Modify reladt/compile, reladt/optimize and reladt/execute.

New Join Algorithms: PREDATOR currently has sort-merge, tuple-nested-loops and page-nested-loops join algorithms. Add a hash-join algorithm. Also, improve the page-nested-loops algorithm to use an in-memory index. Compare the performance of the different algorithms.
Temporal Join Algorithms: Modify the sort-merge join algorithm to implement various "temporal" joins.
Aggregates: Implement hash-based aggregates and compare them with the sort-based aggregates already implemented.
OLAP: Implement a CUBE operator using sorting or hashing.
Sorting: Implement a fast sort that can be integrated with sort-merge joins for increased performance. This involves bypassing the SHORE sort routine.
Views: Implement materialized views and function caching (as a special case of materialized views).
Sampling: Implement sampling versions of all (or some subset of) the operators.
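
As a starting point for the hash-join project, here is a minimal in-memory sketch in Python. It is not PREDATOR code: the dict-based row format and the function names hash_join and nested_loops_join are assumptions made for illustration, and a real implementation inside reladt/execute would also have to handle inputs that do not fit in memory.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows using an in-memory hash table."""
    # Build phase: bucket the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: look up each probe row and emit the merged matches.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            result.append({**match, **row})
    return result

def nested_loops_join(outer_rows, inner_rows, outer_key, inner_key):
    """Reference tuple-nested-loops join, used to check the hash join."""
    return [{**o, **i} for o in outer_rows for i in inner_rows
            if o[outer_key] == i[inner_key]]

# Hypothetical sample data.
depts = [{"dept_id": 1, "dept": "db"}, {"dept_id": 2, "dept": "os"}]
emps = [{"name": "ann", "dept_id": 1}, {"name": "bob", "dept_id": 1},
        {"name": "cy", "dept_id": 3}]
joined = hash_join(depts, emps, "dept_id", "dept_id")
```

Checking the hash join against the nested-loops version on the same inputs is a cheap correctness test before comparing running times.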

Query Optimization

KBZ Join Ordering: Add a KBZ join optimizer for join queries.
Randomized Join Ordering: Add a variety of randomized join optimizers.
Enhanced Statistics: Implement advanced statistics collection (like serial histograms) and modify selectivity estimation to use them.
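
To make the statistics project concrete, the sketch below estimates equality selectivity from a small histogram. It is a simplification, not the serial histograms mentioned above: it keeps exact counts for the most frequent values and assumes the remaining values are uniformly distributed. The names build_histogram and equality_selectivity are hypothetical.

```python
from collections import Counter

def build_histogram(values, top_k):
    """Keep exact counts for the top_k most frequent values; pool the rest."""
    counts = Counter(values)
    frequent = dict(counts.most_common(top_k))
    rest_rows = len(values) - sum(frequent.values())
    rest_distinct = len(counts) - len(frequent)
    return {"total": len(values), "frequent": frequent,
            "rest_rows": rest_rows, "rest_distinct": rest_distinct}

def equality_selectivity(hist, value):
    """Estimate the fraction of rows that satisfy column = value."""
    if hist["total"] == 0:
        return 0.0
    if value in hist["frequent"]:
        return hist["frequent"][value] / hist["total"]
    if hist["rest_distinct"] == 0:
        return 0.0
    # Uniform assumption over the values that share the pooled bucket.
    return hist["rest_rows"] / hist["rest_distinct"] / hist["total"]

# Hypothetical column: one heavy value and three rare ones.
column = ["a"] * 6 + ["b"] * 2 + ["c"] + ["d"]
hist = build_histogram(column, top_k=1)
```

Note that a value never seen in the column gets the same estimate as any pooled value, because the histogram does not record which values fall in the pooled bucket.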

New Data Types

Build a new E-ADT: Take your pick of several possible data types.
Feature extraction: PREDATOR already has some code for image feature extraction. However, while all the mechanism is in place, the actual feature extraction code is not in the released codebase (it belongs to a different project). Use your own feature extraction code instead. Or try this with audio data.
Multi-resolution storage: An initial implementation of multi-resolution storage is provided for image data. Implement a more generic version that can be applied uniformly across data types.
Compressed/Encrypted storage: Build mechanisms that can be used to transparently store data in compressed or encrypted form.
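
One way to picture "transparent" compressed storage is a store whose callers read and write plain bytes while the stored form is compressed. The sketch below uses Python's standard zlib module; the CompressedStore class and its methods are assumptions for illustration and say nothing about how PREDATOR's storage layer is organized.

```python
import zlib

class CompressedStore:
    """Key-value store that compresses on write and decompresses on read."""

    def __init__(self, level=6):
        self._level = level   # zlib compression level, 0 to 9
        self._blobs = {}      # key -> compressed bytes

    def put(self, key, data: bytes):
        self._blobs[key] = zlib.compress(data, self._level)

    def get(self, key) -> bytes:
        # Callers always see the original bytes.
        return zlib.decompress(self._blobs[key])

    def stored_size(self, key) -> int:
        return len(self._blobs[key])

store = CompressedStore()
payload = b"PREDATOR " * 1000   # hypothetical, highly repetitive value
store.put("doc1", payload)
```

The same wrapper shape would apply to encrypted storage, with the compress and decompress calls replaced by a cipher of your choice.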

Indexing

Feature Indexing: Explore options for indexing high-dimensional feature data.
Generalized Indexing: Incorporate GiST trees as another indexing alternative.
Specialized indexes: Build a specialized index for a particular data type (for example, signature file for documents)
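
For the signature-file example, a minimal sketch using superimposed coding might look like the following. Each word sets a few bits in a fixed-width document signature; a query checks whether the word's bits are all present and then verifies the candidate against the text to discard false drops. The signature width, the bits per word, and all names here are assumptions.

```python
import hashlib

SIGNATURE_BITS = 64   # assumed signature width
BITS_PER_WORD = 3     # assumed number of bit positions drawn per word

def word_mask(word):
    """Map a word to a bit mask with up to BITS_PER_WORD bits set."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    mask = 0
    for i in range(BITS_PER_WORD):
        mask |= 1 << (digest[i] % SIGNATURE_BITS)
    return mask

def signature(text):
    """OR together the masks of all words in the document."""
    sig = 0
    for word in text.split():
        sig |= word_mask(word)
    return sig

class SignatureFile:
    def __init__(self):
        self._docs = []   # list of (signature, text) pairs

    def add(self, text):
        self._docs.append((signature(text), text))
        return len(self._docs) - 1   # document id

    def search(self, word):
        mask = word_mask(word)
        hits = []
        for doc_id, (sig, text) in enumerate(self._docs):
            # Signature test: all of the word's bits must be set.
            if (sig & mask) == mask:
                # Verify against the text to remove false drops.
                if word.lower() in text.lower().split():
                    hits.append(doc_id)
        return hits

index = SignatureFile()
index.add("hash join algorithms")
index.add("sort merge join")
```

Wider signatures reduce false drops at the cost of more space per document, which is the main tuning decision in this design.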

Data Mining

Standard Mining Algorithms: Implement a set of data mining algorithms as part of a separate relational query language (in addition to SQL). PREDATOR allows multiple query languages, so simply add a data mining language.
Integrated Mining: Add a specific mining algorithm within the SQL engine and explore possibilities for query optimization.
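
As one candidate for a "standard" mining algorithm, the sketch below finds frequent itemsets level by level in the style of Apriori. It is a small in-memory illustration with hypothetical names and data; how such an algorithm is exposed through a mining language or the SQL engine is the actual project.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets in at least min_support baskets."""
    baskets = [frozenset(t) for t in transactions]
    items = sorted({item for b in baskets for item in b})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count how many baskets contain each candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: union pairs of frequent sets into larger candidates.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every subset one item smaller must be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, size - 1))]
    return frequent

# Hypothetical market-basket data.
purchases = [["bread", "milk"], ["bread", "beer", "milk"],
             ["beer", "milk"], ["bread", "milk", "eggs"]]
result = frequent_itemsets(purchases, min_support=2)
```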

Data Warehousing

Trigger Mechanism: Provide a basic trigger mechanism which can specify actions to be performed on updates.
View Maintenance: Implement view maintenance algorithms either on top of the trigger mechanism, or standalone.
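
The relationship between the two projects above can be sketched as follows: a table fires callbacks on insert, and a materialized aggregate view registers a callback so it is maintained incrementally instead of being recomputed. The Table and SumByKeyView classes are hypothetical, and the sketch handles inserts only; updates and deletes would need their own triggers.

```python
from collections import defaultdict

class Table:
    """In-memory table that fires registered actions after each insert."""

    def __init__(self):
        self.rows = []
        self._insert_triggers = []

    def on_insert(self, action):
        self._insert_triggers.append(action)

    def insert(self, row):
        self.rows.append(row)
        for action in self._insert_triggers:
            action(row)

class SumByKeyView:
    """Materialized SUM(value) grouped by key, maintained by a trigger."""

    def __init__(self, table, key, value):
        self.key, self.value = key, value
        self.totals = defaultdict(int)
        for row in table.rows:          # initial materialization
            self._apply(row)
        table.on_insert(self._apply)    # incremental maintenance

    def _apply(self, row):
        self.totals[row[self.key]] += row[self.value]

# Hypothetical data: one row exists before the view is defined.
sales = Table()
sales.insert({"region": "east", "amount": 10})
view = SumByKeyView(sales, "region", "amount")
sales.insert({"region": "east", "amount": 5})
sales.insert({"region": "west", "amount": 7})
```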

Web-based Data

Semi-Structured Data: Add support for "semi-structured" data using new data types.
Web Search Engine: Build your own. Extend indexing to support inverted files on documents.
Streaming Protocols: Integrate streaming protocols like RealAudio/RealVideo for multi-media playback.
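
For the search-engine project, an inverted file maps each term to the set of documents that contain it. The sketch below is a minimal in-memory version with whitespace tokenization and AND queries; the InvertedIndex class is an assumption for illustration, and an index inside PREDATOR would have to live on disk.

```python
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(set)   # term -> set of document ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return sorted(result)

# Hypothetical documents.
index = InvertedIndex()
index.add(1, "Query optimization in PREDATOR")
index.add(2, "Query processing and indexing")
index.add(3, "Indexing web documents")
```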

Client Extensions

Extend the Java GUI with all the standard client-side extensions.

Client-Side Caching: How much caching can you perform? What limits do browsers place on your memory size?
Client-Specific Delivery: Work the client-side characteristics (like connection speed and browser capabilities) directly into server-side query processing (for example, use lower resolution data).
HTTP Server: Build an HTTP server on top of PREDATOR. HTTP is simply another language supported by a separate query processing engine.
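
The client-side caching question is largely about working within a fixed memory budget. A minimal sketch of a least-recently-used cache bounded by total bytes is shown below in Python for brevity; the real client is the Java GUI, and the ByteBudgetCache class and its budget are assumptions.

```python
from collections import OrderedDict

class ByteBudgetCache:
    """LRU cache that evicts entries once the total size exceeds max_bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._items = OrderedDict()   # least recently used entry first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, data: bytes):
        if key in self._items:
            self.used -= len(self._items.pop(key))
        if len(data) > self.max_bytes:
            return                    # too large to cache at all
        self._items[key] = data
        self.used += len(data)
        while self.used > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used -= len(evicted)

cache = ByteBudgetCache(max_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
cache.get("a")            # touch "a" so that "b" becomes the oldest entry
cache.put("c", b"123")    # exceeds the budget, so "b" is evicted
```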

Applications

Roll your own application, complete with transactions, queries, complex content, web-based clients and a Java GUI. Here are some sample apps we are building:

Conference Management Software: Possibly even to handle submissions to SIGMOD 98.
Digital Library: Images of rare art from national art galleries.
GIS: Geographic data from the Sequoia benchmark.

Mail user support: predator-support@cs.cornell.edu