Chavdar Botev - Lin Guo
This document describes Yet another Query Graph Model (YQGM) - a general model for representing queries. YQGM is inspired by the Query Graph Model (QGM) in the Starburst project [1], the XML Query Graph Model (XQGM) in XPERANTO [3], and the RelOperator graph model in PREDATOR [6]. YQGM is more general because it is independent of the underlying data model and can support XML, relational and other data models.
In short, the model represents the query as a graph. The nodes in the graph are called operators and they are used to perform a specific kind of a computation. The edges are called quantifiers and represent the data flow in the graph. Each quantifier takes the output produced by an operator (called the output operator) and directs it to another operator (called the input operator).
The data flow in the model is encoded as a sequence of tuples, which can be viewed as a relation. Each tuple for a given operator output has a fixed number of attributes. The value of each attribute is a sequence as defined in the underlying data model [4].
As it can be seen, the data model is quite similar to the relational model. The major difference is that the value of an attribute in our model can be quite complex - it can be even a sequence of documents represented as document nodes. Nevertheless, through the rest of the exposition we will try to draw some analogies to SQL, which we hope will make the model clearer.
Let us first brush up the general structure of a SQL statement:
FROM R1, R2, ...
WHERE Pred1 AND Pred2 ...
ORDER BY Key1, Key2, ...
GROUP BY GroupKey1, GroupKey2, ...
An YQGM Static Class Diagram (e.g. Figure
)
consists of three main types of graphical objects: classes (represented
with boxes), generalizations (represented with arrows with hollow
ends), and associations between classes (represented with lines with
possible diamond heads or "stick" arrows). For more
information about the UML language, refer to [2].
The classes describe the YQGM elements; these are the building blocks of the graph model. The associations describe how the YQGM elements relate to each other. In our model, we call them YQGM relationships. Each relationship has a name, which is specified in both of the corresponding association's ends.
The following sections discuss the general YQGM elements (in library YQGM) and relationships
grouped by type. The full YQGM class diagram is shown in section
on Figure
. The standard YQGM elements are implemented in libray
YQGM_STD (Documents for the standard YQGM elements are stored in quark2/yqgm_std/doc ).
In our implementation, each YQGM element is represented through a C++ class. The name of the class is the same as the name in the diagram.
We chose to implement relationships using template classes. There are two template classes with parameters the class of the parent element and class the children elements: Rel1N for 1-to-N relationships and Rel11 for 1-to-1 relationships. They define two subclasses, ParentAccessor and ChildrenAccessor/ChildAccesor, which represent the view on the relationship from the children's/child's perspective and the parent's perspective. The former allows access to the parent of the relationship and the latter allows access to the children/child in the relationship.
These relationships govern the way YQGM elements can be created. An element is constructed by taking as parameters all its parents in relationships where the element is a child. Therefore, the graph is always constructed top-down: first the query, then the operators, then the quantifiers, etc.
The names of the relationships are formed by the short name of the parent class (given in parenthesis in the YQGM element descriptions below), the short name of the children class, and possible additional tag identifying the relationship if there are more than one relationship between the parent and the child class.
| Relationship | Type | Role | Description
qryOpr |
|---|
The UML static class diagram for the Query class is shown on
Figure
. On this diagram (and the subsequent
class diagrams) the classes in focus are shaded. All the other related
classes are given in white boxes. Every query (without user defined
functions) in YQGM is represented
as an instance of the Query class. It consists of one or more
operators (instances of the class Operator - see
),
connected by means of quantifiers (the class Quantifier - see
), thus forming the query graph.
The operators are accessible through the qryOpr relationship. There is one designated operator, called the top operator, which produces the result of the query. It is accessible through the qryOprTop relationship.
If we go back to the analogy with SQL and the relational model, the query roughly corresponds to a compound SQL SELECT-statement with the possible use of nested sub-queries.
| Relationship | Type | Role | Description
qryOpr |
|---|
QueryWithFunction is a query with user defined functions --
FunctionQuery (see
).
The qryUsrFun relationship provides access to all the user defined functions in this query.
The class QueryWithFunction inherits from Query the qryOpr relationship to all its operators and the qryOprTop relationship to the top operator.
| Attribute | Type | Description
functionName |
|---|
| Relationship | Type | Role | Description
qryOpr |
|---|
The FunctionQuery inherits from Query the qryOpr relationship to all its opertators and the qryOprTop relationship to the top operator.
| Relationship | Type | Role | Description
qryOpr |
|---|
The operator is the primary computational element in the YQGM graph. It may or may not take input data from its input quantifiers but it always produces some output, which is accessed by the other operators through its output quantifiers.
The Operator class is an abstract class that defines the common features of all operators (see Figure 2). There are eight types of operators that differ on the type of computation (functionality) they perform:
The common features among all these types of operators are the relationships
listed in
. The qryOpr relationship allows access
to the containing query. The oprOcl relationship encompasses
the output columns (see
) of the operator that are used
to compute the values for each attribute of the output tuple. The
oprQunIpt relationship provides access to the input quantifiers
for the operator, i.e. the quantifiers that contain the input sequences
of tuples. The oprQunOpt relationship contains the output quantifiers
of the operator; these are used by other operators to access the data
produced by the operator.
Except for ParamOperator, all other specific operators are implemented in library YQGM_STD. The specifics of these operators are explained in quark2/yqgm_std/doc.
| Relationship | Type | Role | Description
qryOpr |
|---|
| Relationship | Type | Role | Description
oprQunIpt |
|---|
Instances of the Quantifier class are used to connect one operator's
output as another operator's input. The former is accessible through
the oprQunOpt relationship, and the latter is accessible through
the oprQunIpt relationship (see Figure
).
The tuples are accessible by the input operator through input columns
(see
). Each input column is used to access the value of
an output column in the output operator. The input columns can be
referred to using the qunIcl relationship.
There are three types of quantifiers: EachQuantifier, AllQuantifier and SomeQuantifier. All of the three subclasses are implemented in library YQGM_STD.
The InputColumn class allows the access to the value of an operator's output column through a quantifier.
| Relationship | Type | Role | Description
qunIcl |
|---|
The containing quantifier is accessible through the qunIcl relationship. The output column producing the value of the attribute is accessible through the oclIcl relationship. The PColumns accessing the value of the input column can be found using the iclPcl relationship.
| Relationship | Type | Role | Description
expPrt |
|---|
The Expression class is the abstract ancestor of all types
of expressions used in YQGM. Through the expPrt relationship,
it provides a parse tree (see
) that can be used to compute
the value of the expression.
There are five types of expressions. Except for OutputColumn, all the specific expressions are implemented in library YQGM_STD.
The Expression class corresponds to the expression that can be found in the different clauses in a SELECT statement in SQL.
| Relationship | Type | Role | Description
expPrt |
|---|
The output column is a kind of expression (
) used to compute
the value of a single attribute in the output tuple sequence of an
operator. This value is then accessed by some of the input columns
of the output quantifiers for the parent operator.
The oprOcl relationship provides access to the containing operator. The expPrt relationship contains the parse tree used to compute the output column's value. The oclIcl relationship gives access to the input columns referring to the values produces by this output column.
The output column is analogous to the expressions in the SELECT clause of a SELECT query.
| Relationship | Type | Role | Description
expPrt |
|---|
Parse trees can be composed by supplying a parse tree as a parameter to a parse tree node representing a function (see below). There are three types of parse trees (see also Figure 5).
The relationship expPrt points to the containing expression
object and is set only for the root node of the parse tree. The relationship
pfnPrtParent can be used to access the parent PFunction
(see
) which takes this parse tree as an input parameter.
| Relationship | Type | Role | Description
expPrt |
|---|
The PColumn class can be used to access the value of an input
column (see
) in an expression.
This class inherits from ParseTree the relationship to the
containing expression (expPrt) and the relationship to the
parent PFunction object (if any) - pfnPrtParent. The
PColumn class adds the iclPcl relationship to the input
column whose value is accessed.
The PColumn corresponds to the reference to a column of a relation in the FROM clause in a SQL statement.
| Relationship | Type | Role | Description
expPrt |
|---|
PFunction is an abstract class used to relate a set of parameters (accessible by the pfnPrtChildren relationship) to some functions (implemented by its subclasses). The parameters are other ParseTrees. This class inherits from ParseTree the relationships to the containing expression (expPrt) and the relationship to the parent PFunction object (if any) - pfnPrtParent.
This class corresponds to the function application in SQL expressions. We implement two types of PFunctions (see also Figure 5).
| Relationship | Type | Role | Description
expPrt |
|---|
http://www.distlab.dk/predator/papers/designdochtml/designdoc.html
This document was generated using the LaTeX2HTML translator Version 2002 (1.62)
Copyright © 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999,
Ross Moore,
Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html yqgm.tex -split 0 -nonavigation
The translation was initiated by Anand Bhaskar on 2005-04-12