Disclaimer: This survey is not finished work, and its discussion may not even reflect my current beliefs. It has been put online as a service to the web community, and should be useful as at least the references are correct!
Since those words were written this perception has completely vanished from the OODBMS community. A good query language is now recognized as one of the cornerstones of an OODBMS alongside a sound object-oriented data model with unique object identifiers (OIDs). This change of perception is primarily a result of the OODBMS community coming to two important conclusions:
Early efforts at OODBMSs, and query languages for them, centered on making particular object-oriented programming languages persistent. A good example of this is the O++ Database Programming Language [Agr93], which, not surprisingly, is an extension of C++ with support for persistent objects. These persistent languages provided support for user queries through the programming language itself, or simple preprocessor extensions of it. The queries written in these languages were therefore completely non-declarative, usually involving intricate pointer traversal through object members, and completely specific to the internal representation of objects. Even so, as mentioned before, there were many who believed that this was a sufficient querying capability for OODBMSs.
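The contrast between navigational and declarative querying can be illustrated with a minimal sketch. It is written in Python rather than O++, and the `Part` and `Assembly` classes, their fields and the `select` helper are all hypothetical names chosen for illustration. The first function walks object members by hand and so depends on how they are stored; the second states only the selection condition.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class Part:
    name: str
    weight: float


@dataclass
class Assembly:
    # Hypothetical internal representation: parts kept in a plain list.
    parts: List[Part] = field(default_factory=list)


def heavy_parts_navigational(assemblies: List[Assembly], limit: float) -> List[str]:
    """Navigational style: explicit traversal tied to the `parts` list layout."""
    result = []
    for assembly in assemblies:
        for part in assembly.parts:  # breaks if `parts` changes representation
            if part.weight > limit:
                result.append(part.name)
    return result


def select(extent: Iterable[Part], predicate: Callable[[Part], bool]) -> Iterator[Part]:
    """Declarative style: the caller supplies only the condition."""
    return (obj for obj in extent if predicate(obj))


assemblies = [
    Assembly([Part("bolt", 0.1), Part("engine", 150.0)]),
    Assembly([Part("door", 25.0)]),
]
# Stand-in for a class extent, which a database system would maintain itself.
all_parts = [p for a in assemblies for p in a.parts]

navigational = heavy_parts_navigational(assemblies, 20.0)
declarative = [p.name for p in select(all_parts, lambda p: p.weight > 20.0)]
```

Both forms return the same names here, but only the second leaves the system free to choose how the extent is scanned.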
Once the first generation of OODBMSs had stabilized, the OODBMS
community realized that better querying capabilities were needed. The
discussion in [Ber92] covers the issues recognized at
that time as being important in a declarative object-oriented query language.
The following subsections look at these issues in some detail.
2.1. Object Data Model
An object-oriented query language needs to be founded on a well defined
object-oriented data model (see chapter 27).
Since there is no such data model which is
globally accepted (something which is still true today, in 1996), proper
definition of the data model is a delicate and difficult task. The data
model should, however, at least support the following object-oriented
concepts:
These issues can be resolved by annotating object references with attributes indicating whether the object is owned and/or sharable, and by controlling sharing through a rule system. However, if not done properly, this can lead to the OODBMS equivalent of invalid pointers, and a complete solution may require the OODBMS to do some sort of garbage collection.
As a final point on the object data model, ownership and sharing, the
Objects versus Values issue should be mentioned. Some data models
distinguish the two and provide special handling for values, with values being
structures containing basic data, e.g. the two floats of a complex number.
Other data models, most notably [Sar92], treat a value
as a special unnamed, non-sharable object with no methods, providing a simple
specification of the structure of a tuple.
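The distinction can be sketched as follows, under the assumption that values compare by content while objects compare by OID. This is a minimal Python illustration, not the mechanism of any particular data model; `Complex`, `Account` and the OID counter are hypothetical.

```python
from dataclasses import dataclass
from itertools import count

_next_oid = count(1)  # hypothetical OID generator


@dataclass(frozen=True)
class Complex:
    """A value: identified purely by its contents (two floats)."""
    re: float
    im: float


class Account:
    """An object: carries an OID that is independent of its state."""

    def __init__(self, balance: float) -> None:
        self.oid = next(_next_oid)
        self.balance = balance

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Account) and self.oid == other.oid

    def __hash__(self) -> int:
        return hash(self.oid)


# Two values with the same contents are indistinguishable.
same_value = Complex(1.0, 2.0) == Complex(1.0, 2.0)

# Two objects with the same state remain distinct.
a, b = Account(100.0), Account(100.0)
same_object = a == b

# A shared reference sees updates made through any other reference.
shared = a
shared.balance = 50.0
```

Because the value type is immutable and has no identity, it cannot be shared in any observable way, which matches the "non-sharable" reading given above.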
2.3. Query Predicates
A declarative query language needs to be able to specify selection criteria
using various predicates. The situation for object-oriented query languages
is far more complicated than that of relational query languages, due to the
much richer data model. Some of the issues for object-oriented query
predicates include:
P.name().last = P.mother.name().last.
This is an important property of OODBMS queries which is cumbersome
to do in conventional relational systems (through the use of joins).
tuple(person, vehicle).
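The path-expression predicate `P.name().last = P.mother.name().last` mentioned above can be sketched in Python as follows. The `Person` and `Name` classes are hypothetical, and the explicit check for a missing mother is an assumption of this sketch, not something the predicate itself specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Name:
    first: str
    last: str


class Person:
    def __init__(self, first: str, last: str, mother: Optional["Person"] = None) -> None:
        self._name = Name(first, last)
        self.mother = mother  # direct object reference, no foreign key

    def name(self) -> Name:
        return self._name


def same_last_name_as_mother(people: List[Person]) -> List[Person]:
    # Mirrors the predicate P.name().last = P.mother.name().last
    return [
        p for p in people
        if p.mother is not None and p.name().last == p.mother.name().last
    ]


ann = Person("Ann", "Smith")
bob = Person("Bob", "Smith", mother=ann)
cat = Person("Cat", "Jones", mother=ann)
matches = same_last_name_as_mother([ann, bob, cat])
```

The reference `p.mother` is followed directly; a relational formulation would instead join the person table with itself on a mother key.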
Not all OODBMSs enforce this rule in the same way. Some do not enforce
it at all, allowing the user to call any method in a query without specifying
the resulting behavior; others allow insertion but not deletion or
modification; yet others do not allow method calls in queries at
all. A happy medium would seem to be to allow the evaluation of
side-effect-free methods in user queries, although this requires the method
implementation or definition language to be able to express side effect
``freeness''.
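One way such a ``happy medium'' might look is sketched below in Python. The `side_effect_free` decorator and the `call_in_query` helper are hypothetical; the annotation is only a promise made by the method's implementer, and nothing in this sketch verifies it.

```python
from typing import Any, Callable


def side_effect_free(method: Callable) -> Callable:
    """Hypothetical annotation: the implementer promises no state changes."""
    method.side_effect_free = True
    return method


class Employee:
    def __init__(self, name: str, salary: float) -> None:
        self.name = name
        self.salary = salary

    @side_effect_free
    def annual_salary(self) -> float:
        return self.salary * 12

    def give_raise(self) -> float:  # mutates state, so not annotated
        self.salary *= 1.1
        return self.salary


def call_in_query(obj: Any, method_name: str) -> Any:
    """Invoke a method on behalf of a query, rejecting unannotated ones."""
    method = getattr(obj, method_name)
    if not getattr(method, "side_effect_free", False):
        raise PermissionError(f"{method_name} is not declared side-effect free")
    return method()


staff = [Employee("Ann", 1000.0), Employee("Bob", 3000.0)]
well_paid = [e.name for e in staff if call_in_query(e, "annual_salary") > 20000]

try:
    call_in_query(staff[0], "give_raise")
    rejected = False
except PermissionError:
    rejected = True
```

The rejected call never runs, so the first employee's salary is left unchanged.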
2.5. Query Algebras and Calculi
An object-oriented query language needs to rest on an object-oriented
algebra with an object-oriented rewrite calculus. This is of utmost
importance in the evaluation of the expressiveness of the query language,
as well as in the optimizations of queries (see chapters 17 and 29).
An important additional constraint on an object-oriented query language, and its associated algebra and calculus, is that the language be closed. This means that the results of queries need to be representable in the database, and thereby processable by other queries, allowing the nesting and recursion of queries.
It can be argued that the success of the relational database model has been largely due to its sound closed relational algebra and rewrite calculus. Therefore there is a consensus in the database community that any ``real'' OODBMS, or, more generally, any next-generation database system, needs to have both an algebra and a calculus.
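The closure property can be illustrated with a minimal Python sketch. The `Collection` class and its `select` and `project` operators are hypothetical and are not taken from any of the algebras cited here; the point is only that every operator returns the same kind of thing it consumes, so queries nest freely.

```python
from typing import Callable, Generic, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Collection(Generic[T]):
    """Hypothetical collection type that serves as both query input and output."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items = list(items)

    def select(self, predicate: Callable[[T], bool]) -> "Collection[T]":
        # The result is again a Collection, so it can be queried further.
        return Collection(x for x in self._items if predicate(x))

    def project(self, fn: Callable[[T], U]) -> "Collection[U]":
        return Collection(fn(x) for x in self._items)

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)


ages = Collection([23, 51, 64, 38])
over_50 = ages.select(lambda a: a > 50)
years_to_65 = over_50.project(lambda a: 65 - a)
nested = ages.select(lambda a: a > 30).select(lambda a: a < 60)
```

A language whose queries returned something other than a storable collection, such as printed rows, could not be composed in this way.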
There have been several algebras and calculi proposed for OODBMSs, e.g. those in
[Ala89], [Str91], [Ber92], [Fer93], [Leu93], [Fer95] and [Cer95],
most of which focus on algebras and calculi suitable for efficient
optimization of queries (see chapter 29).
2.6. Divergence from SQL
Following is a brief summary of the differences between an OODBMS and its query
language, and a Relational Database Management System (RDBMS) and its query
language, e.g. SQL:
The proposals in the literature, which, as can be seen, tend to be named OQL, include the following ones:
The current direction of the standards committee seems to be to make SQL3 a
superset of ODMG OQL, with a slightly restricted data model, while retaining
all the features of SQL92. A subset of SQL3, combined with POSTQUEL, has
actually been implemented in POSTGRES95, a derivative of POSTGRES.
3.3. Some Examples
The following four figures provide examples of simple ODMG-93 OQL queries.
They are from O2 Technology's online demonstration.

    SELECT car
    FROM car IN Cars
    WHERE car.car_manufacturer.country = "Germany"

    SELECT o
    FROM u IN Users.users,
         o IN u.orders
    WHERE o.date.month = 4
      AND o.date.year = 1996

    SELECT component
    FROM car_manufacturer IN CarManufacturers,
         car IN car_manufacturer.cars,
         principal_characteristic IN car.characteristics,
         component IN principal_characteristic.components
    WHERE car_manufacturer.name == "Mercedes"
      AND car.name == "500"
      AND principal_characteristic.name = "Engine"
      AND component.name like "Air*"

    SELECT distinct struct( n: p.name )
    FROM p IN Employees
    WHERE p.age > 50

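To show how dependent range variables in a FROM clause can be read, the second query above (orders dated April 1996) is sketched here in Python. The `User`, `Order` and `Date` classes and the sample data are hypothetical stand-ins for the demonstration schema, and a Python list stands in for the collection that the query returns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Date:
    year: int
    month: int


@dataclass
class Order:
    ident: int
    date: Date


@dataclass
class User:
    name: str
    orders: List[Order] = field(default_factory=list)


def orders_in_april_1996(users: List[User]) -> List[Order]:
    # FROM u IN Users.users, o IN u.orders  ->  nested iteration
    # WHERE o.date.month = 4 AND o.date.year = 1996  ->  filter
    return [
        o
        for u in users
        for o in u.orders
        if o.date.month == 4 and o.date.year == 1996
    ]


users = [
    User("ann", [Order(1, Date(1996, 4)), Order(2, Date(1996, 5))]),
    User("bob", [Order(3, Date(1995, 4)), Order(4, Date(1996, 4))]),
]
result = orders_in_april_1996(users)
```

The second range variable `o` ranges over a collection reached through the first, `u`, which is why the FROM clause reads as a nested loop rather than as a cross product followed by a join condition.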
It must first be stated that any future research in OODBMS query languages is most likely to take place in industry, rather than academia. This has historically been the case with RDBMS query languages, with SQL, QBE, etc. being developed in industry, and it is very likely to be the case in the future for OODBMS query languages. Academic researchers, in fact, have explicitly stated (see [Car87]) that they do not want the task of developing query languages, and the quick universal acceptance of ODMG-93 OQL serves as proof of this fact. Academic research in algebras and calculi, however, will probably continue undaunted for a while, as a part of query optimization.
4.1. Candidate Research Problems
There is little initiative for innovation in the area of OODBMS query
languages, at least as yet, since OODBMS vendors are still working on the
basic issues in their implementations, and not all of them have fully
implemented ODMG-93 OQL. Therefore ODMG-93 OQL, and slight variants thereof,
are likely to be the query language for quite some time to come, just as SQL
has been the virtually unchanged query language for RDBMSs for a long time now.
This situation is unlikely to change to any great extent in the near future, at least not in the conventional database setting. Users and managers of databases are used to SQL-like queries and are quite satisfied with the support OQL gives them, even if they may be unsatisfied with the speed of the query processing.
The area which may become important in research on OODBMS query languages is that of queries by end users, people who may not even be really aware that they are accessing a database. As object-oriented techniques continue to proliferate in all areas of computing, the need to query large collections of objects will become a major issue for end users. Research in this area will be closely tied to research in user interfaces, since the query language will be a part of the user interface for the end users. The following is a list of possible candidate topics:
This situation forces us to eliminate candidate 1, the construction of a completely new query language. There simply isn't any motivation for it in the OODBMS community. There isn't much demand for the research, the benefits for actual users are not major, the possibility of major breakthroughs is very small and the time span of the research is probably long, since all new languages have to be implemented and used to be fully evaluated.
Candidate 2 also fails most of our criteria, but is likely to be realized even so, given its incremental nature. There will undoubtedly be future versions of ODMG OQL and those versions will be modified in ways dictated by the user community. These incremental changes will be done almost automatically as OODBMSs mature and develop.
The interfacing of query languages with rule systems, candidate 3, is an area where there isn't really much demand as of yet, since there are very few databases which provide good support for rule systems. There is however already much demand for such systems and they will therefore probably be incorporated into OODBMSs in the near future. When this happens there will undoubtedly be demand for tighter integration with the query language. There are however no major benefits to be gained from this research, nor major discoveries in sight, and a simple de-facto standard is likely to be developed and accepted over a fairly short period of time.
The last candidate is the most promising one. The demand for research on the issue may not be overwhelming at this time, but its time is undoubtedly ripe. In the near future millions of users worldwide will be working with most of their data in terms of objects, be it within application software like CAD systems or spreadsheets, or just in the file storage system. As computers become networked onto the World Wide Web they will have access to more and more information, most of which can be viewed as objects of different types and interconnections.
There are currently no really good methods for querying by end users of object-oriented information, the most popular solutions being the glorified GREP query-by-keyword systems which do full-text searches on the data. The benefits of an end-user query system which allows the user to build non-trivial queries that make use of the full object types and interconnections are tremendous.
This is likely to be a somewhat long-term research project, as user-interface
research doesn't usually live up to its full potential until decades after the
initial work on it started (e.g. the mouse and GUI). Also, the likelihood of
any major discoveries seems small; the best one could hope for is an easily
used graphical object-oriented equivalent of Query By Example. Even so, this
is a very important research area, since any advances, however small they may
seem, are likely to affect the everyday work of millions of users.
4.3. Prospects
As mentioned above, candidates 2 and 3 will probably be resolved almost
automatically in the next 5 to 10 years as OODBMSs mature. The resources and
backing required for them are merely the continued survival of OODBMS vendors
and an active OODBMS user community. In all likelihood both candidates will
have been fully resolved in 10 years' time.
Candidate 4, the development of a good end-user OODBMS query mechanism, will not resolve itself by incremental changes to existing technology. Its development requires strong industrial backing and very possibly paradigm shifts from existing query mechanisms. The best setting for this research and development would probably be an industrial research lab with top-notch database and user interface researchers, generous funding and well-defined goals. Optimally there should be several such groups at several companies exploring different approaches.
Undoubtedly there is a better intuitive way of querying OODBMSs through graphical QBE-like interfaces than the form-based GREP utilities of today. Therefore we must conclude that the chances of the research being successful are very high. It is easy to measure this success; if the majority of end users in 5 to 10 years are using a powerful and intuitive query mechanism to perform searches on their local and non-local data our goals have been accomplished.
This paper describes the authors' experience in the design, implementation and continued development of O++, written very late in the project. O++ is a good representative of very early OODBMSs.
Alashqur, A.M., Su, S.Y.W. and Lam, H., ``OQL: A Query Language for Manipulating Object-Oriented Databases'', Proceedings of the Fifteenth International Conference on Very Large Data Bases, Amsterdam, Holland, August 1989.
This paper describes an OODBMS query language, OQL (not to be confused with other OQLs), which achieves closure by returning a subdatabase as the result of queries.
Atkinson, M. P., Bancilhon, F., DeWitt, D., Dittrich, K., Maier, D., and Zdonik, S., ``The Object-Oriented Database System Manifesto'', Proc. First International Conference on Deductive and Object-Oriented Databases, Kyoto, Japan, December 1989.
This paper summarizes the (somewhat unformed) very early views and expectations of OODBMSs. The paper should not be read as a description of the current view of OODBMSs, but rather as an historical document giving insight into the motivation and early foundation of OODBMS research.
Baker, H., ``Equal Rights for Functional Objects or, The More Things Change, The More They Are The Same'', OOPS Messenger, Vol. 4, No. 4, pp. 1 - 26, October, 1993.
This paper gives a good overview of the problem of determining object equality, describes most of the previous work in the area, and gives a ``perfect'' algorithm for performing the comparison.
Bertino, E., Negri, M., Pelagatti, G. and Sbattella, L., ``Object-Oriented Query Languages: The Notion and Issues'', IEEE Transactions on Knowledge and Data Engineering, Vol. 4, No. 3, pp. 223, June 1992.
This paper is an excellent discussion of the issues involved in the design and implementation of an object-oriented query language. The paper was written at a time (1989) when the OODBMS community was just realizing the importance of query languages, even though it was not published until much later.
Carey, M.J., DeWitt, D.J., Vandenberg, S.L., ``A Data Model and Query Language for EXODUS'', Technical Report CS-TR-87-734, University of Wisconsin, Madison, 1987.
This paper describes an early attempt at a declarative query language, EXCESS, for the research OODBMS EXODUS.
The Object Database Standard: ODMG-93, Release 1.2, edited by R.G.G. Cattell, ISBN 1-55860-396-4, Morgan Kaufmann Publishers, 1996.
Contains the Object Database Management Group specification of OQL, an Object Query Language based on the query language of O2 [Deu90]. This OQL is today's (1996) definitive object-oriented query language, supported by most commercial OODBMSs, as well as most research OODBMSs, such as Thor.
Deux, O., et al., ``The Story of O2'', IEEE Transactions on Knowledge and Data Engineering, Vol. 2, No. 1, pp. 91, March 1990.
This paper describes O2 (O subscript 2), an early OODBMS. The system has had far-reaching influences on OODBMSs, most notably on the query language ODMG-93 OQL [Cat96]. The query language of O2 is a simple closed extension of SQL allowing construction of new values as results of queries, while distinguishing between objects and values.
Sarkar, M., Reiss, S. P., ``A Data Model and A Query Language for Object-Oriented Databases'', Technical Report CS-92-57, Brown University, December 1992.
This paper presents a fairly advanced query language (and an associated algebra), called OQL (again, do not confuse this with other OQLs), which is more expressive than previous algebras while maintaining closure of query results and allowing such things as recursive queries.
Shipman, D., ``The Functional Data Model and the Data Language DAPLEX'', ACM Transactions on Database Systems, Vol. 6, No. 1, pp. 140, March 1981.
This paper describes the highly influential early semantic data model and query language DAPLEX. The ideas presented in this paper, in the setting of a functional data language, have influenced all later advanced query languages, especially object-oriented ones.