
We refine the quasi-prefix form by adding the following subtypes. This makes recognizing and handling complex mathematical content cleaner.

We first introduce object *math subformula*, which is used to capture
subexpressions appearing within the [tex2html_wrap5306] and [tex2html_wrap5308] of (La)TeX.
Object *math subformula* can be thought of as being the math equivalent
of object *text block* described in s:high-level-models.
It has the following structure:

**Attribute:** Visual attributes.

**Content:** The mathematical content represented as a *math object*.

We need object *math subformula* to represent expressions of the
form:

[displaymath5302]

[displaymath5303]

In representing each of the above examples, object *math
subformula* is essential in capturing the expression to which the
overbrace/underbrace applies.
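
As a rough illustration of the structure described above, the following Python sketch models *math subformula* as a record with visual attributes and content. It is not the system's own code: the class names, field names, the sample expression `x + y`, and the tuple used to encode the overbrace are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MathObject:
    """Hypothetical stand-in for a *math object*; holds raw markup only."""
    contents: str


@dataclass
class MathSubformula:
    """Sketch of *math subformula*: visual attributes plus content."""
    content: MathObject
    # Attribute names are assumptions; the text says only "visual attributes".
    attributes: dict = field(default_factory=dict)


# The braced group {x + y} becomes one subformula, so an enclosing
# overbrace has a single, well-defined expression to apply to.
group = MathSubformula(content=MathObject("x + y"))
overbraced = ("overbrace", group)  # hypothetical encoding of \overbrace{x + y}
```

Wrapping the braced group in one object is what lets the brace refer to the whole expression rather than to its last token.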

To recognize written mathematics, tokens must be classified appropriately. Our classification of tokens is inspired by Appendix F of The TeXbook [Knu84].

The symbols divide naturally into groups based on their mathematical class (Ord, Op, Bin, Rel, Open, Close, or Punct), [tex2html_wrap5310]

We introduce subtypes of object *math object* to correspond to
each token type:

**Ordinary:** TeX Ord. Letters, numbers and some miscellaneous symbols.

**Big operator:** TeX Op. The *large* operators that typically appear as unary operators, *e.g.,* [tex2html_wrap5312], [tex2html_wrap5314], [tex2html_wrap5316].

**Binary operator:** TeX Bin. The binary operators, *e.g.,* +, [tex2html_wrap5320].

**Relational operator:** TeX Rel, *e.g.,* <, [tex2html_wrap5324]. We subdivide the TeX Rel class into relational and arrow operators.

**Arrow operators:** Arrows such as [tex2html_wrap5326], [tex2html_wrap5328].

**Mathematical function:** Plain TeX and LaTeX define [tex2html_wrap5330] etc. as macros. We introduce an object type, *mathematical function*, to represent these.

**Open delimiter:** TeX Open, *e.g.,* [tex2html_wrap5332], [tex2html_wrap5334].

**Close delimiter:** TeX Close, *e.g.,* [tex2html_wrap5336], [tex2html_wrap5338].

**Math punctuation:** TeX Punct. Punctuation marks.
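
The subtypes listed above can be pictured as a lookup table from tokens to classes. The Python sketch below is illustrative only and is not the system's implementation: the enum, the function name `classify`, and the sample tokens are assumptions chosen for this example, and the table is far from complete.

```python
from enum import Enum


class MathClass(Enum):
    """One member per subtype of *math object* named in the text."""
    ORDINARY = "ordinary"
    BIG_OPERATOR = "big-operator"
    BINARY_OPERATOR = "binary-operator"
    RELATIONAL_OPERATOR = "relational-operator"
    ARROW_OPERATOR = "arrow-operator"
    MATHEMATICAL_FUNCTION = "mathematical-function"
    OPEN_DELIMITER = "open-delimiter"
    CLOSE_DELIMITER = "close-delimiter"
    MATH_PUNCTUATION = "math-punctuation"


# Sample entries only; the tokens are this example's own choices.
CLASSIFICATION = {
    r"\sum": MathClass.BIG_OPERATOR,
    "+": MathClass.BINARY_OPERATOR,
    "<": MathClass.RELATIONAL_OPERATOR,
    r"\rightarrow": MathClass.ARROW_OPERATOR,  # split out of TeX Rel
    r"\sin": MathClass.MATHEMATICAL_FUNCTION,
    "(": MathClass.OPEN_DELIMITER,
    ")": MathClass.CLOSE_DELIMITER,
    ",": MathClass.MATH_PUNCTUATION,
}


def classify(token: str) -> MathClass:
    """Look up a token; anything unlisted (letters, digits) is ordinary."""
    return CLASSIFICATION.get(token, MathClass.ORDINARY)
```

Defaulting unlisted tokens to ordinary mirrors the description of that class as covering letters, numbers and miscellaneous symbols.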

Written mathematical notation uses *juxtaposition* as an infix
operator. Juxtaposition, as in [tex2html_wrap5340], mostly denotes
multiplication, but can mean function application in certain contexts,
*e.g.,* [tex2html_wrap5342]. We introduce a new operator to represent
juxtaposition and, to define it precisely, we also assert that all
mathematical variables are single letters. Thus, [tex2html_wrap5344] is represented
as the juxtaposition of three *ordinary* objects. This assertion
is not specific to our internal representation;
rather, it specifies the concrete syntax used in the electronic markup
and reflects the choice made in the design of TeX. We do allow
mathematical variables made up of more than one character, but these
should be clearly marked up as such, *e.g.,* as [tex2html_wrap5346], by
using `\mbox` as in `$\mbox{cab}=cab$`.
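
The single-letter rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the recognizer itself: the function names and the tuple encoding of the juxtaposition operator are hypothetical, and the sketch handles only non-empty runs of letters and `\mbox{...}` groups.

```python
import re

# A \mbox{...} group stays whole; every other letter is its own variable.
TOKEN = re.compile(r"\\mbox\{([A-Za-z]+)\}|([A-Za-z])")


def split_variables(source: str) -> list:
    """Return the variable names in a run of letters and \\mbox groups."""
    return [m.group(1) or m.group(2) for m in TOKEN.finditer(source)]


def juxtapose(source: str):
    """Build ('juxtaposition', operands...); a lone operand is returned as-is."""
    operands = split_variables(source)
    if len(operands) == 1:
        return operands[0]
    return ("juxtaposition", *operands)
```

With this sketch, `juxtapose("cab")` yields a juxtaposition of `c`, `a` and `b`, while `juxtapose(r"\mbox{cab}")` yields the single name `cab`, which is the distinction the markup `$\mbox{cab}=cab$` draws.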

The classification of a math object is defined using the following
command:

(`define-math-classification` *token* *classification*)

In certain special cases, the predefined classification shown above
can be modified. A good example is a mathematical
text that consistently uses the letters [tex2html_wrap5348], [tex2html_wrap5350] and [tex2html_wrap5352] to denote
functions. Using the predefined classification, the recognizer would
treat [tex2html_wrap5354] as object *ordinary*, leading to [tex2html_wrap5356] being
represented as the juxtaposition of two objects, namely, [tex2html_wrap5358] and [tex2html_wrap5360].
Declaring [tex2html_wrap5362] to be a mathematical function by executing

(`define-math-classification` f `mathematical-function-name`)

results in occurrences of [tex2html_wrap5364] being treated as a function. Hence, [tex2html_wrap5366] is correctly recognized as a function application. Note that the correct interpretation of such notation is more important for browsing than for speaking the expression.
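
The effect of reclassifying a token can be mimicked with a small Python sketch. It is a toy model rather than the actual command: `define_math_classification` here is a hypothetical Python analogue of `define-math-classification`, and `recognize` and its tuple results are assumptions introduced only to show how the override changes the outcome.

```python
# Overrides to the predefined classification; empty by default.
classification = {}


def define_math_classification(token, math_class):
    """Hypothetical analogue of (define-math-classification token class)."""
    classification[token] = math_class


def classify(token):
    """Tokens without an override fall back to ordinary."""
    return classification.get(token, "ordinary")


def recognize(name, argument):
    """Interpret a letter followed by a parenthesized argument, e.g. f(x)."""
    if classify(name) == "mathematical-function-name":
        return ("function-application", name, argument)
    return ("juxtaposition", name, argument)


before = recognize("f", "x")  # f is still ordinary here
define_math_classification("f", "mathematical-function-name")
after = recognize("f", "x")   # f is now treated as a function
```

Here `before` is a juxtaposition and `after` is a function application, so a single declaration changes how every later occurrence of the letter is read.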


Thu Mar 9 20:10:41 EST 1995