Chapter 28

Object-Oriented Query Languages

Úlfar Erlingsson


Disclaimer: This survey is not finished work, and its discussion may not even reflect my current beliefs. It has been put online as a service to the web community, and should be useful as at least the references are correct!

  1. Introduction
  2. Background
    1. Object Data Model
    2. Object Ownership and Sharing
    3. Query Predicates
    4. Query Side Effects
    5. Query Algebras and Calculi
    6. Divergence from SQL
    7. Proposed Query Languages
  3. Current Status
    1. ODMG-93 OQL
    2. Object SQL or SQL3
    3. Some Examples
  4. Future Directions
    1. Candidate Research Problems
    2. Facing Up to Reality
    3. Prospects
  5. Annotated References


1. Introduction

In the Object-Oriented Database System Manifesto [Atk89], some of the early proponents of Object-Oriented Database Management Systems (henceforth referred to as OODBMS) stated that such systems should ``...provide the functionality of an ad hoc query language.'', and furthermore that ``We do not require that it be done in the form of a query language...''.

Since those words were written this perception has completely vanished from the OODBMS community. A good query language is now recognized as one of the cornerstones of an OODBMS alongside a sound object-oriented data model with unique object identifiers (OIDs). This change of perception is primarily a result of the OODBMS community coming to two important conclusions:

In this chapter we will focus on the historical development of the modern-day OODBMS query language, pointing out the main requirements for such a language and how they differ from those of a relational query language. We then briefly describe today's status of object-oriented query languages, with emphasis on what can arguably be called the most important OODBMS query language, Object Query Language, or OQL, of the Object Database Management Group (see [Cat96]). Finally, we peer into the crystal ball and speculate on possible future research directions in object-oriented query languages.


2. Background

OODBMSs became popular in the mid-eighties, as a result of the sharply increased popularity of object-oriented programming languages. However, the database community had by then already been experimenting with advanced data models and query languages for some time. These were termed semantic data models, with perhaps the most influential one being the functional model and language DAPLEX [Shi81].

Early efforts at OODBMSs, and query languages for them, centered around making particular object-oriented programming languages persistent. A good example of this is the O++ Database Programming Language [Agr93], which, not surprisingly, is an extension of C++ with support for persistent objects. These persistent languages provided support for user queries through the programming language itself, or simple preprocessor extensions of it. The queries written in these languages were therefore completely non-declarative, usually involving intricate pointer traversal through object members, and completely specific to the internal representation of objects. Even so, as mentioned before, there were many who believed that this was a sufficient querying capability for OODBMSs.

Once the first generation of OODBMSs had stabilized, the OODBMS community realized that better querying capabilities were needed. The discussion in [Ber92] covers the issues recognized at that time as being important in a declarative object-oriented query language. The following subsections look at these issues in some detail.

2.1. Object Data Model

An object-oriented query language needs to be founded on a well-defined object-oriented data model (see chapter 27). Since there is no such data model that is globally accepted (something which is still true today, in 1996), proper definition of the data model is a delicate and difficult task. The data model should, however, at least support the following object-oriented concepts:

2.2. Object Ownership and Sharing

Support for complex objects is complicated by the concept of ownership. If one object contains a reference to another, do we consider the second object as being owned by the first, and if we do, may other objects still reference the second object? Thus, for example, if E is an employee who drives vehicle V, do we delete V when we delete E, and are other employees allowed to drive V?

These issues can be resolved by annotating object references with attributes indicating whether the object is owned and/or sharable, and by controlling sharing through a rule system. However, if not done properly, this can lead to the OODBMS equivalent of invalid pointers, and a complete solution may require the OODBMS to do some sort of garbage collection.
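
The employee and vehicle example can be made concrete with a small sketch. This is only an illustration under stated assumptions: the names Obj, Ref and Store, and the two flags owned and sharable, are hypothetical and are not taken from any particular OODBMS. Deleting an object cascades to the objects it owns, and a check afterwards reports references that were left pointing at deleted objects.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so objects can be put in sets
class Obj:
    name: str
    refs: list = field(default_factory=list)  # outgoing annotated references


@dataclass
class Ref:
    target: Obj
    owned: bool = False     # deleting the source also deletes the target
    sharable: bool = True   # other objects may also reference the target


class Store:
    """Hypothetical object store that applies simple ownership and sharing rules."""

    def __init__(self):
        self.live = set()

    def add(self, obj):
        self.live.add(obj)
        return obj

    def link(self, source, target, owned=False, sharable=True):
        # Rule: a target referenced through a non-sharable reference has one referrer only.
        for other in self.live:
            for ref in other.refs:
                if ref.target is target and (not ref.sharable or not sharable):
                    raise ValueError(f"{target.name} is not sharable")
        source.refs.append(Ref(target, owned, sharable))

    def delete(self, obj):
        self.live.discard(obj)
        for ref in obj.refs:
            if ref.owned and ref.target in self.live:
                self.delete(ref.target)  # cascade to owned objects

    def dangling(self):
        # References from live objects to deleted ones (the invalid pointers).
        return [(o.name, r.target.name)
                for o in self.live for r in o.refs
                if r.target not in self.live]


store = Store()
e1, e2, v = (store.add(Obj(n)) for n in ("E1", "E2", "V"))
store.link(e1, v, owned=True)   # E1 owns vehicle V
store.link(e2, v)               # E2 also drives V
store.delete(e1)                # cascades to V, leaving E2 with a dangling reference
```

In this sketch an owned but sharable reference is exactly the combination that produces a dangling reference; a rule system would have to forbid it, defer the deletion, or rely on garbage collection.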

As a final point on the object data model, ownership and sharing, the Objects versus Values issue should be mentioned. Some data models distinguish the two and provide special handling for values, with values being structures containing basic data, e.g. the two floats of a complex number. Other data models, most notably [Sar92], treat a value as a special unnamed, non-sharable object with no methods, providing a simple specification of the structure of a tuple.
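
The practical difference between the two shows up in comparisons. The following is a minimal sketch, assuming a hypothetical ComplexValue value type and a hypothetical Employee object type with a counter standing in for the OID: values are compared by content, objects by identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexValue:
    # A value: no identity of its own, compared by content.
    re: float
    im: float


class Employee:
    # An object: compared by identity (its OID), not by content.
    _next_oid = 0

    def __init__(self, name):
        self.oid = Employee._next_oid
        Employee._next_oid += 1
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Employee) and self.oid == other.oid

    def __hash__(self):
        return hash(self.oid)


same_value = ComplexValue(1.0, 2.0) == ComplexValue(1.0, 2.0)  # equal contents
same_object = Employee("Ann") == Employee("Ann")               # distinct OIDs
```

Here same_value is true while same_object is false, even though both employees carry the same name.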

2.3. Query Predicates

A declarative query language needs to be able to specify selection criteria using various predicates. The situation for object-oriented query languages is far more complicated than that of relational query languages, due to the much richer data model. Some of the issues for object-oriented query predicates include:

2.4. Query Side Effects

A query should be read-only, i.e. not modify the state of the database (e.g. through calls to methods with side-effects) other than in the construction of (intermediate) query-results. Relaxing this restriction and allowing updates in queries is of dubious value and greatly complicates all issues relating to queries.

Not all OODBMSs enforce this rule in the same way. Some do not enforce it at all, allowing the user to call any method in queries without specifying the resulting behavior; others allow insertion but not deletion or modification; yet others do not allow method calls in queries at all. A happy medium would seem to be to allow the evaluation of side-effect-free methods in user queries, although this requires the method implementation or definition language to be able to express side effect ``freeness''.
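
One way to picture this happy medium is sketched below. It assumes a hypothetical side_effect_free marker that the method author applies by hand, and a hypothetical select function standing in for the query processor; the marker is a declaration only and is not verified against the method body.

```python
def side_effect_free(method):
    # Hypothetical marker: the author promises that the method does not modify state.
    method.side_effect_free = True
    return method


class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    @side_effect_free
    def is_overdrawn(self):
        return self.balance < 0

    def withdraw(self, amount):  # unmarked: modifies the object
        self.balance -= amount
        return self.balance


def select(collection, method_name, *args):
    """Return the elements for which the named method is true.

    Only methods marked side_effect_free may be used as predicates.
    """
    result = []
    for obj in collection:
        method = getattr(obj, method_name)
        if not getattr(method, "side_effect_free", False):
            raise PermissionError(f"{method_name} may have side effects")
        if method(*args):
            result.append(obj)
    return result


accounts = [Account("a", -5), Account("b", 10)]
overdrawn = select(accounts, "is_overdrawn")   # allowed
# select(accounts, "withdraw", 1) raises PermissionError before calling the method
```

The check happens before the method is called, so a rejected query leaves the database state untouched.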

2.5. Query Algebras and Calculi

An object-oriented query language needs to rest on an object-oriented algebra with an object-oriented rewrite calculus. This is of utmost importance in evaluating the expressiveness of the query language, as well as in the optimization of queries (see chapters 17 and 29).

An important additional constraint on an object-oriented query language, and its associated algebra and calculus, is that the language be closed. This means that the results of queries need to be representable in the database, and thereby processable by other queries, allowing the nesting and recursion of queries.
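
Closure can be illustrated with a minimal sketch. The ObjectSet class and its select and project operations are hypothetical names, not part of any of the algebras cited here; the point is only that every operation returns the same kind of collection it consumes, so queries can be nested.

```python
class ObjectSet:
    """A hypothetical collection type that queries both consume and produce."""

    def __init__(self, items):
        self.items = list(items)

    def select(self, predicate):
        # The result is again an ObjectSet, so it can be queried further.
        return ObjectSet(x for x in self.items if predicate(x))

    def project(self, fn):
        return ObjectSet(fn(x) for x in self.items)

    def __iter__(self):
        return iter(self.items)


# Made-up sample data
employees = ObjectSet([
    {"name": "Ann", "age": 61, "dept": "R&D"},
    {"name": "Bob", "age": 45, "dept": "R&D"},
    {"name": "Cy", "age": 57, "dept": "Sales"},
])

# A query over the result of another query: possible only because of closure.
senior = employees.select(lambda e: e["age"] > 50)
senior_rnd_names = (senior.select(lambda e: e["dept"] == "R&D")
                          .project(lambda e: e["name"]))
```

If select returned, say, a plain cursor that other queries could not range over, the second query could not be written against the result of the first.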

It can be argued that the success of the relational database model has been largely due to its sound closed relational algebra and rewrite calculus. Therefore there is a consensus in the database community that any ``real'' OODBMS, or, more generally, any next-generation database system, needs to have both an algebra and a calculus.

There have been several algebras and calculi proposed for OODBMSs, e.g. those in [Ala89], [Str91], [Ber92], [Fer93], [Leu93], [Fer95] and [Cer95], most of which focus on algebras and calculi suitable for efficient optimization of queries (see chapter 29).

2.6. Divergence from SQL

Following is a brief summary of the differences between an OODBMS and its query language, and a Relational Database Management System (RDBMS) and its query language, e.g. SQL:

As a final point, it can safely be said that OODBMSs, provided with a good query language, should be able to express all queries expressible in SQL. Many of these queries will be conceptually much clearer when expressed in an OODBMS, because of the added semantic information in the data model and the extensibility of the query language.

2.7. Proposed Query Languages

There are several proposals for declarative OODBMS query languages in the literature. Some of these proposals, however, mention the lack of a recognized standard declarative query language as the main reason for their definition. It therefore is no great surprise that after the publication of the ODMG-93 OQL standard there have been few proposals for new query languages, with algebras and calculi proliferating instead.

The proposals in the literature, which, as can be seen, tend to be named OQL, include the following ones:


3. Current Status

The situation in OODBMS query languages today is very similar to that of three years ago. There is still no unified concept of an object-oriented data model, with as many as eight models being popular candidates. Most commercial and research OODBMSs (O2 and Thor being examples, respectively) are committed to supporting the ODMG-93 data model and OQL query language [Cat96]. The only other candidate for a popular OODBMS query language is SQL3, which is hampered by its SQL legacy and its glacial standards-committee definition process.

3.1. ODMG-93 OQL

ODMG-93 OQL [Cat96] is the most important query language of today. It is based on the query language of O2 [Deu90], with extensions to support most of ANSI SQL92. In brief the notable data model and query language features, those differing from what might be expected from the earlier discussion, are the following:

3.2. Object SQL or SQL3

Object SQL, or SQL3, is an effort to turn ANSI SQL92 into an OODBMS query language supporting ``everything for everybody''. The standards definition process has been going on for more than three years now, and, in the great tradition of such all-encompassing standards (e.g. Ada and C++), has grown to quite monstrous proportions. SQL3 also follows in the tradition of Ada in using Abstract Data Types, with procedures with strongly typed arguments serving as methods, instead of a true object paradigm with both members and methods.

The current direction of the standards committee seems to be to make SQL3 a superset of ODMG OQL, with a slightly restricted data model, while retaining all the features of SQL92. A subset of SQL3, combined with POSTQUEL, has actually been implemented in POSTGRES95, a derivative of POSTGRES.

3.3. Some Examples

The following four figures provide examples of simple ODMG-93 OQL queries. They are from O2 Technology's online demonstration.
     SELECT car 
     FROM car IN Cars 
     WHERE car.car_manufacturer.country = "Germany"
Figure 1: All cars manufactured in Germany.

     SELECT o 
     FROM u IN Users.users,
          o IN u.orders 
     WHERE o.date.month = 4
           AND o.date.year = 1996
Figure 2: All orders placed by any user in April 1996.

     SELECT component
     FROM car_manufacturer IN CarManufacturers,
          car IN car_manufacturer.cars,
          principal_characteristic IN car.characteristics,
          component IN principal_characteristic.components 
     WHERE car_manufacturer.name = "Mercedes"
           AND car.name = "500"
           AND principal_characteristic.name = "Engine"
           AND component.name like "Air*"
Figure 3: All "Air*" components from Mercedes 500 engines.

     SELECT distinct struct( n: p.name )
     FROM p IN Employees
     WHERE p.age > 50
Figure 4: All distinct names of employees over 50.
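
The dependent range variables in the FROM clauses of Figures 2 and 3 can be read as nested iteration, where each variable ranges over a collection reached from the previous one. The sketch below restates Figure 3 in Python as an illustration of that reading only; the class definitions and sample data are made up, and fnmatch is used as a stand-in for the "Air*" pattern match.

```python
import fnmatch
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str


@dataclass
class Characteristic:
    name: str
    components: list = field(default_factory=list)


@dataclass
class Car:
    name: str
    characteristics: list = field(default_factory=list)


@dataclass
class CarManufacturer:
    name: str
    cars: list = field(default_factory=list)


# Made-up sample data standing in for the CarManufacturers collection.
car_manufacturers = [
    CarManufacturer("Mercedes", [
        Car("500", [
            Characteristic("Engine", [Component("Air filter"), Component("Piston")]),
            Characteristic("Body", [Component("Air bag")]),
        ]),
        Car("190", [Characteristic("Engine", [Component("Air intake")])]),
    ]),
]

# Each "x IN y.z" range in the FROM clause becomes one nested loop.
result = [
    component
    for car_manufacturer in car_manufacturers
    for car in car_manufacturer.cars
    for principal_characteristic in car.characteristics
    for component in principal_characteristic.components
    if car_manufacturer.name == "Mercedes"
    and car.name == "500"
    and principal_characteristic.name == "Engine"
    and fnmatch.fnmatchcase(component.name, "Air*")
]
```

A real query processor is free to evaluate the query in a different order, for example by filtering manufacturers before descending into their cars; the nested loops only describe what the result contains.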


4. Future Directions

In this section we briefly look at the future directions for research in OODBMS query languages. We look at some candidates for research problems, relate them to the real world and screen out those we find infeasible, and finally take a closer look at those which we find likely to be successful in the face of our scrutiny.

It must first be stated that any future research in OODBMS query languages is most likely to take place in industry, rather than academia. This has historically been the case with RDBMS query languages (SQL, QBE, etc. were all developed in industry) and is very likely to be the case in the future for OODBMS query languages. Academic researchers, in fact, have explicitly stated (see [Car87]) that they do not want the task of developing query languages, and the quick universal acceptance of ODMG-93 OQL serves as proof of this fact. Academic research in algebras and calculi, however, will probably continue undaunted for a while, as a part of query optimization.

4.1. Candidate Research Problems

There is little initiative for innovation in the area of OODBMS query languages, at least as of yet, since OODBMS vendors are still working on the basic issues in their implementations and not all of them have even fully implemented ODMG-93 OQL. Therefore ODMG-93 OQL, and slight variants thereof, are likely to remain the query language for quite some time to come, just as SQL has been, virtually unchanged, the query language for RDBMSs for a long time now.

This situation is not likely to change to a great extent in the near future, at least not in the conventional database setting. Users and managers of databases are used to SQL-like queries and are quite satisfied with the support OQL gives them, even if they may be unsatisfied with the speed of the query processing.

The area which may become important in research on OODBMS query languages is that of queries by end users, people who may not even be really aware that they are accessing a database. As object-oriented techniques continue to proliferate in all areas of computing, the need to query large collections of objects will become a major issue for end users. Research in this area will be closely tied to research in user interfaces, since the query language will be a part of the user interface for the end users. The following is a list of possible candidate topics:

  1. A New Super-Duper OODBMS Query Language: The construction and universal acceptance of a new theoretically sound and complete OODBMS query language with more expressive power and without the inherited problems from SQL (which OQL has) is without a doubt a worthy topic.
  2. Polishing ODMG-93 OQL: There are several areas where the language is not very strong, especially in its objects-vs-values dichotomy and its lack of support for complex ownership.
  3. Connection to Rule Systems: Queries and updates are bound to trigger rules, and vice-versa. There needs to be a better synergy between the two, e.g. through a similar syntax and tight integration.
  4. Graphical Query Construction: Construction of queries via a graphical interface, either in a constructive OQL-like query-tree format, or in a drag-and-drop Query-by-Example associative-graph format of some sort, is likely to be successful as an ad hoc query builder for end users, just as QBE has been for RDBMSs.

4.2. Facing Up to Reality

When looking at these possible research topics we have to evaluate them based on several criteria, at least the following ones:

  1. Demand for the research.
  2. The possible benefits from the research.
  3. The likelihood of major discoveries.
  4. The likely time span of the research.

As mentioned before, there doesn't seem to be much demand for, or interest in, new research in OODBMS query languages. There are of course several issues which need to be cleared up as OODBMSs mature and are further developed, but overall people seem happy with the current OQL derivative of SQL.

This situation forces us to eliminate candidate 1, the construction of a completely new query language. There simply isn't any motivation for it in the OODBMS community. There isn't much demand for the research, the benefits for actual users are not major, the possibility of major breakthroughs is very small and the time span of the research is probably long, since all new languages have to be implemented and used to be fully evaluated.

Candidate 2 also fails most of our criteria, but is likely to be realized even so, given its incremental nature. There will undoubtedly be future versions of ODMG OQL and those versions will be modified in ways dictated by the user community. These incremental changes will be done almost automatically as OODBMSs mature and develop.

The interfacing of query languages with rule systems, candidate 3, is an area where there isn't really much demand as of yet, since there are very few databases which provide good support for rule systems. There is, however, already much demand for rule systems themselves, and they will therefore probably be incorporated into OODBMSs in the near future. When this happens there will undoubtedly be demand for tighter integration with the query language. There are, however, no major benefits to be gained from this research, nor major discoveries in sight, and a simple de facto standard is likely to be developed and accepted over a fairly short period of time.

The last candidate is the most promising one. The demand for research on the issue may not be overwhelming at this time, but its time is undoubtedly ripe. In the near future millions of users worldwide will be working with most of their data in terms of objects, be it within application software like CAD systems or spreadsheets, or just in the file storage system. As their computers become networked onto the World Wide Web, these users will have access to more and more information, most of which can be viewed as objects of different types and interconnections.

There are currently no really good methods for end users to query object-oriented information, the most popular solutions being the glorified GREP query-by-keyword systems which do full-text searches on the data. The benefits of an end-user query system which allows the user to build non-trivial queries that make use of the full object types and interconnections are tremendous.

This is likely to be a somewhat long-term research project, as user-interface research doesn't usually live up to its full potential until decades after the initial work on it started (e.g. the mouse and GUI). Also, the likelihood of any major discoveries seems small; the best one could hope for is an easily used graphical object-oriented equivalent of Query By Example. Even so, this is a very important research area, since any advances, however small they may seem, are likely to affect the everyday work of millions of users.

4.3. Prospects

As mentioned above, candidates 2 and 3 will probably be carried out almost automatically in the next 5 to 10 years as OODBMSs mature. The only resources and backing they require are the continued survival of OODBMS vendors and an active OODBMS user community. In all likelihood both candidates will have been fully resolved in 10 years' time.

Candidate 4, the development of a good end-user OODBMS query mechanism, will not resolve itself by incremental changes to existing technology. Its development requires strong industrial backing and very possibly paradigm shifts from existing query mechanisms. The best setting for this research and development would probably be an industrial research lab with top-notch database and user interface researchers, generous funding and well-defined goals. Optimally there should be several such groups at several companies exploring different approaches.

Undoubtedly there is a better, more intuitive way of querying OODBMSs through graphical QBE-like interfaces than the form-based GREP utilities of today. Therefore we must conclude that the chances of the research being successful are very high. It is easy to measure this success: if the majority of end users in 5 to 10 years are using a powerful and intuitive query mechanism to perform searches on their local and non-local data, our goals have been accomplished.


5. Annotated References

Agrawal, R., Dar, S. and Gehani, N., ``The O++ Database Programming Language: Implementation and Experience'', Proceedings IEEE International Conference on Data Engineering, pp. 61, Vienna, Austria, April 1993.

This paper describes the authors' experience in the design, implementation and continued development of O++, written very late in the project. O++ is a good representative of very early OODBMSs.

Alashqur, A.M., Su, S.Y.W. and Lam, H., ``OQL: A Query Language for Manipulating Object-Oriented Databases'', Proceedings of the Fifteenth International Conference on Very Large Data Bases, Amsterdam, Holland, August 1989.

This paper describes an OODBMS query language, OQL (not to be confused with other OQLs), which achieves closure by returning a subdatabase as the result of queries.

Atkinson, M. P., Bancilhon, F., DeWitt, D., Dittrich, K., Maier, D., and Zdonik, S., ``The Object-Oriented Database System Manifesto'', Proc. First International Conference on Deductive and Object-Oriented Databases, Kyoto, Japan, December 1989.

This paper summarizes the (somewhat unformed) very early views and expectations of OODBMSs. The paper should not be read as a description of the current view of OODBMSs, but rather as an historical document giving insight into the motivation and early foundation of OODBMS research.

Baker, H., ``Equal Rights for Functional Objects or, The More Things Change, The More They Are The Same'', OOPS Messenger, Vol. 4, No. 4, pp. 1 - 26, October 1993.

This paper gives a good overview of the problem of determining object equality, describes most of the previous work in the area, and gives a ``perfect'' algorithm for performing the comparison.

Bertino, E., Negri, M., Pelagatti, G. and Sbattella, L., ``Object-Oriented Query Languages: The Notion and Issues'', IEEE Transactions on Knowledge and Data Engineering, Vol. 4, No. 3, pp. 223, June 1992.

This paper is an excellent discussion of the issues involved in the design and implementation of an object-oriented query language. The paper was written at a time (1989) when the OODBMS community was just realizing the importance of query languages, even though it was not published until much later.

Carey, M.J., DeWitt, D.J., Vandenberg, S.L., ``A Data Model and Query Language for EXODUS'', Technical Report CS-TR-87-734, University of Wisconsin, Madison, 1987.

This paper describes an early attempt at a declarative query language, EXCESS, for the research OODBMS EXODUS.

The Object Database Standard: ODMG-93, Release 1.2, edited by R.G.G. Cattell, ISBN 1-55860-396-4, Morgan Kaufmann Publishers, 1996.

Contains the Object Database Management Group specification of OQL, an Object Query Language based on the query language of O2 [Deu90]. This OQL is today's (1996) definitive object-oriented query language, supported by most commercial OODBMSs, as well as most research OODBMSs, such as Thor.

Deux, O., et al., ``The Story of O2'', IEEE Transactions on Knowledge and Data Engineering, Vol. 2, No. 1, pp. 91, March 1990.

This paper describes O2 (O subscript 2), an early OODBMS. The system has had far-reaching influence on OODBMSs, most notably on the query language ODMG-93 OQL [Cat96]. The query language of O2 is a simple closed extension of SQL allowing construction of new values as results of queries, while distinguishing between objects and values.

Sarkar, M., Reiss, S. P., ``A Data Model and A Query Language for Object-Oriented Databases'', Technical Report CS-92-57, Brown University, December 1992.

This paper presents a fairly advanced query language (and an associated algebra), called OQL (again, do not confuse this with other OQLs), which is more expressive than previous algebras while maintaining closure of query results and allowing such things as recursive queries.

Shipman, D., ``The Functional Data Model and the Data Language DAPLEX'', ACM Transactions on Database Systems, Vol. 6, No. 1, pp. 140, March 1981.

This paper describes the highly influential early semantic data model and query language DAPLEX. The ideas presented in this paper, in the setting of a functional data language, have influenced all later advanced query languages, especially object-oriented ones.