Producing audio notation for mathematics


We exploit the abstraction of the audio space to define unique audio dimensions that make up the various pieces of the notation. These dimensions can be thought of as lines determined by a combination of the speech and non-speech dimensions described in c:afl. The AFL states used to produce different pieces of the audio notation are reached by ``moving'' along these dimensions. The functions used to generate new states are monotonic in the mathematical sense described in eq:monotonic.

We choose unique audio dimensions to map the quasi-prefix form into audio space. The quasi-prefix representation is a tree with attributes. We pick one audio dimension, denoted by dim-children (see fig:children), along which to vary the current AFL state as different levels of a tree are rendered. We next choose dimensions orthogonal to dim-children to cue the visual attributes as follows. Let [tex2html_wrap5658] and [tex2html_wrap5660] denote two speech-space dimensions that are orthogonal to dim-children. Select three lines in the speech space, [tex2html_wrap5662], [tex2html_wrap5664], and [tex2html_wrap5666]. Moving forward or backward along these three lines cues the six visual attributes.

Conventional mathematical notation has built up a strong association between the superscript and subscript, in that we intuitively think of them as opposites, i.e., the superscript moves up, and the subscript moves down. AsTeR takes advantage of this association by moving the AFL state ``forward'' along the line [tex2html_wrap5668] before rendering superscripts and ``backward'' along this same line before rendering subscripts. States along the line [tex2html_wrap5670] cue left superscripts and subscripts; states along [tex2html_wrap5672] cue accents and underbars. By our choice of [tex2html_wrap5674] and [tex2html_wrap5676], these variations are independent of dimension dim-children. See fig:superscript and fig:subscript for the audio dimensions that are currently used for cueing superscripting and subscripting.
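The idea of reaching new states by moving forward or backward along a line can be sketched in a few lines of Python. This is only an illustration, not AFL's actual interface (AFL itself is written in Lisp). The function `move`, the dimension name `smoothness`, and every numeric value are assumptions made for the example; only the names average-pitch and head-size come from the text.

```python
# Illustrative sketch only. The dimension "smoothness" and all numeric
# values are assumptions; they are not taken from AFL.

def move(state, line, steps):
    """Return a new state displaced `steps` units along `line`.

    `line` maps dimension names to a per-step offset. Positive `steps`
    moves forward, negative `steps` moves backward. The input state is
    left unmodified.
    """
    new_state = dict(state)
    for dim, offset in line.items():
        new_state[dim] = new_state[dim] + steps * offset
    return new_state


BASE = {"average-pitch": 100.0, "head-size": 100.0, "smoothness": 50.0}

# A line in the speech space that touches only pitch and head size.
SUPER_SUB_LINE = {"average-pitch": 10.0, "head-size": -5.0}
# A stand-in for dim-children that uses a disjoint dimension.
CHILDREN_LINE = {"smoothness": 8.0}

superscript_state = move(BASE, SUPER_SUB_LINE, +1)  # forward
subscript_state = move(BASE, SUPER_SUB_LINE, -1)    # backward
```

Because the two lines in this sketch share no dimension, a move along one leaves the position along the other untouched, and the two moves can be applied in either order. That is the sense in which the attribute cues are independent of dim-children here.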

[Figure fig:children: Audio dimension used for rendering subtrees.]

The effect of moving along the audio dimension shown in fig:children is to produce a softer, more animated voice. As deeper levels of nesting are entered, the change in voice characteristic produces a sense of falling off into the distance.
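As a rough sketch of how nesting depth could map onto a position along dim-children, the following Python function walks a tree and assigns each node a value that grows strictly with depth. The tree encoding, the function name `speak_tree`, and the numbers are hypothetical, and nodes are visited in prefix order purely for illustration; this does not describe the order in which AsTeR speaks an expression.

```python
# Illustrative sketch only: the tree encoding and all values are assumptions.

def speak_tree(node, depth=0, base=50.0, step=8.0):
    """Return (label, voice_value) pairs for every node in the tree.

    `node` is a pair (label, list_of_children). `voice_value` stands in
    for the position of the AFL state along dim-children and increases
    by `step` with each level of nesting.
    """
    label, children = node
    spoken = [(label, base + depth * step)]
    for child in children:
        spoken.extend(speak_tree(child, depth + 1, base, step))
    return spoken


# The expression a + (b * c) written as a tree.
expr = ("plus", [("a", []), ("times", [("b", []), ("c", [])])])
rendering = speak_tree(expr)
```

In this sketch `b` and `c` receive a larger value than `a`, which in turn receives a larger value than the root, so deeper subtrees are always rendered further along the dimension.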

[Figure fig:superscript: Audio dimension used for rendering superscripts.]

A change along the audio dimension shown in fig:superscript produces a higher-pitched voice. The accompanying change in head size keeps the voice from sounding unpleasant. The step sizes along both the average-pitch and head-size dimensions are reduced, which allows subscripts nested inside superscripts to be rendered unambiguously. The change in AFL state in fig:subscript is the exact opposite of the change in fig:superscript.
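One way to see why a reduced step size helps is the sketch below. It assumes one particular reading of the text, namely that the step shrinks by a fixed factor at each further level of nesting. The function `nested_offset`, the starting step, and the shrink factor are all hypothetical.

```python
# Illustrative sketch only: the shrink-per-level rule and all values
# are assumptions, not AFL's actual behaviour.

def nested_offset(path, step=10.0, shrink=0.5):
    """Return the pitch offset reached after a sequence of moves.

    `path` is a list of "sup" (forward) and "sub" (backward) moves,
    outermost first. After each move the step is scaled by `shrink`.
    """
    offset = 0.0
    for kind in path:
        if kind == "sup":
            offset += step
        elif kind == "sub":
            offset -= step
        else:
            raise ValueError(f"unknown move: {kind!r}")
        step *= shrink
    return offset


# A subscript inside a superscript, with and without a shrinking step.
with_shrink = nested_offset(["sup", "sub"])
without_shrink = nested_offset(["sup", "sub"], shrink=1.0)
```

With a constant step, the backward move for the inner subscript cancels the forward move for the superscript, and the voice returns to the baseline. With a shrinking step, the inner subscript lands between the baseline and the superscript voice, so it remains distinguishable from both.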

[Figure fig:subscript: Audio dimension used for rendering subscripts.]

In cases where no contextual information is available, the visual attributes appearing on a math object are rendered in the following order:

1. Subscript.
2. Superscript.
3. Underbar.
4. Accent.
5. Left-subscript.
6. Left-superscript.

The above ordering is motivated by the fact that in traditional mathematical notation, the subscript binds the tightest. The order in which attributes are rendered is held in the Lisp variable *attributes-reading-order* and may be changed by the user.
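The effect of a user-adjustable reading order can be sketched as follows. This is a Python stand-in, not the Lisp code used by AsTeR: the list `ATTRIBUTES_READING_ORDER` merely mirrors the ordering given above, and the function `order_attributes` and the attribute names as strings are assumptions made for the example.

```python
# Illustrative sketch only: a Python stand-in for the Lisp variable
# *attributes-reading-order*; names and representation are assumptions.

ATTRIBUTES_READING_ORDER = [
    "subscript",
    "superscript",
    "underbar",
    "accent",
    "left-subscript",
    "left-superscript",
]


def order_attributes(attributes, reading_order=ATTRIBUTES_READING_ORDER):
    """Return (name, content) pairs sorted by the given reading order.

    `attributes` maps attribute names to their content. Every name must
    appear in `reading_order`; otherwise a KeyError is raised.
    """
    rank = {name: position for position, name in enumerate(reading_order)}
    return sorted(attributes.items(), key=lambda item: rank[item[0]])


# An object carrying a superscript, an accent, and a subscript.
default_order = order_attributes(
    {"superscript": "2", "accent": "hat", "subscript": "i"}
)
# A user who prefers superscripts first passes a different order.
custom_order = order_attributes(
    {"superscript": "2", "subscript": "i"},
    reading_order=["superscript", "subscript"],
)
```

Keeping the order in a single list means a change of preference touches one value rather than the rendering logic.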

In style simple, a commonly used rendering style, subscripts and superscripts are rendered by first moving either backwards or forwards along the audio dimensions shown in fig:superscript and fig:subscript. This produces extremely concise and unambiguous renderings. Consider the following expressions:

[equation1420]

[equation1425]

Here, a plain verbal rendering produces an unnecessarily complicated description that makes it difficult to comprehend the inherent structure present in the expression.

Here is an example to illustrate the benefits of an audio notation when rendering unusual mathematical notation. In the following, $+_n$ denotes addition modulo $n$. Given this information,

$$x +_n y +_n z$$

could be spoken as ``x plus mod n y plus mod n z''. However, if this information is unavailable, AsTeR can still produce a rendering that can be correctly interpreted by a listener who is aware of the fact that the $+$ sign can be subscripted. Further, the listener who is familiar with [displaymath5686] denoting modulo arithmetic can now understand the expression.

In style descriptive, new AFL states are used only if necessary when rendering superscripts and subscripts. Typically, ``x 1'' in traditional spoken math means $x_1$. Rendering style descriptive takes advantage of this convention to avoid using new AFL states when rendering subscripts that are simple. Note, however, that by doing so, rendering style descriptive does introduce ambiguity in the renderings;

[displaymath5690]

and

[displaymath5692]

will sound the same. In our experience, we have found that this ambiguity is not a problem when rendering mathematical texts; few authors write

[displaymath5694]

in place of the preferred

[displaymath5696]
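The trade-off between the two styles can be made concrete with a small sketch. It is not AsTeR's implementation. The test for a ``simple'' subscript (a single letter or digit), the voice labels, and the function names are all assumptions, and the juxtaposed case is only a hypothetical example of an expression that could collide with a subscripted one.

```python
# Illustrative sketch only: the simplicity test, voice labels, and
# function names are assumptions, not AsTeR's actual rules.

def is_simple(subscript):
    """Assumed test: a single letter or digit counts as simple."""
    return len(subscript) == 1 and subscript.isalnum()


def render_subscripted(base, subscript, style):
    """Return (text, voice) pairs for `base` carrying `subscript`.

    Style "simple" always speaks the subscript in a new voice. Style
    "descriptive" keeps the baseline voice when the subscript is simple.
    """
    if style not in ("simple", "descriptive"):
        raise ValueError(f"unknown style: {style!r}")
    if style == "descriptive" and is_simple(subscript):
        voice = "baseline"
    else:
        voice = "subscript"
    return [(base, "baseline"), (subscript, voice)]


def render_juxtaposed(left, right):
    """Return (text, voice) pairs for two symbols written side by side."""
    return [(left, "baseline"), (right, "baseline")]


descriptive = render_subscripted("x", "1", "descriptive")
simple = render_subscripted("x", "1", "simple")
juxtaposed = render_juxtaposed("x", "1")
```

In this sketch the descriptive rendering of a simply subscripted symbol is identical to the rendering of the two symbols side by side, which is the ambiguity described above, while style simple keeps the two apart at the cost of a voice change.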


TV Raman
Thu Mar 9 20:10:41 EST 1995