Title in audio
ASTER Demonstration
is dedicated to my
Guide-Dog.
AsTeR --Audio System For Technical Readings-- is a computing system for
rendering technical documents in audio. I developed AsTeR for my
PhD thesis (141 pages). An audio formatted version of the thesis
(approximately 6 hours), produced by AsTeR, is being made available by
RFB (Recordings For the Blind)
as the first computer generated talking book. Here is the abstract in print, and
here is an audio formatted version.
This hypertext document demonstrates the audio renderings generated by AsTeR.
Here is an enhanced demo using inline images.
Each example is made up of three components:
- The original LaTeX input.
- The audio formatted output produced by AsTeR.
The speech is produced by a Dectalk, and has been digitized at
8-bit mulaw.
AsTeR uses stereo to render tables, an effect that is not
conveyed by the 8-bit mono encoding.
- The visually formatted version produced by LaTeX and DVIPS.
How to use this demo:
The examples in this demonstration get progressively more difficult.
I suggest you go through the initial sections sequentially.
For short demos, I typically show people the first three sections, and
round it off
with the continued fraction in section 4 and a quick overview of Faa
De Bruno's formula.
Here is the Postscript file containing all the examples, in case
you want to look over them first.
I am not placing a single file containing all of the audio examples
since this would be about 9MB.
Section 1 simple fractions and expressions.
This set of examples demonstrates the use of voice inflection and pauses to
convey grouping of sub-expressions succinctly.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
Audio state is varied along a dimension in audio space before rendering
sub-expressions.
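As a rough illustration of grouping by pauses and a change of audio state, here is a minimal Python sketch. It is not AsTeR's code: the nested-tuple expression format, the `speak` function, the `[pause]` marker and the integer "pitch step" are all assumptions made for this example.

```python
def speak(expr, depth=0):
    """Return a list of (word, pitch_step) events for a binary expression.

    expr is either a string (a leaf) or a tuple (operator, left, right).
    A compound operand is surrounded by pauses and rendered one pitch
    step away from its parent, so grouping is heard without saying
    "open paren" or "close paren". All names here are hypothetical.
    """
    if isinstance(expr, str):
        return [(expr, depth)]
    op, left, right = expr
    events = []
    for index, child in enumerate((left, right)):
        if index == 1:
            events.append((op, depth))  # speak the operator between operands
        if isinstance(child, tuple):
            events.append(("[pause]", depth))
            events.extend(speak(child, depth + 1))  # shift the audio state
            events.append(("[pause]", depth))
        else:
            events.extend(speak(child, depth))
    return events


# (a + b) / c versus a + (b / c): same words, different grouping cues.
grouped_numerator = speak(("over", ("plus", "a", "b"), "c"))
grouped_fraction = speak(("plus", "a", ("over", "b", "c")))
```

The two results contain the same words in the same order; only the pause markers and pitch steps differ, which is the point of the technique.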
Section 2 superscripts and subscripts.
To convey subscripts, superscripts, and other visual attributes, vary audio
state along a dimension that is orthogonal to (independent of) the dimension
used to convey sub-expressions.
This allows these mutually independent concepts to be nested.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
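The idea of two independent dimensions described at the start of this section can be sketched as follows. This is a hypothetical model, not AsTeR's implementation: the `AudioState` class, its field names, and the tuple tags `"sup"`, `"sub"` and `"group"` are assumptions chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioState:
    grouping: int = 0  # assumed dimension used for sub-expression depth
    script: str = ""   # assumed dimension: "^" per superscript, "_" per subscript


def render(expr, state=AudioState()):
    """Return (symbol, grouping, script) triples for a small expression tree."""
    if isinstance(expr, str):
        return [(expr, state.grouping, state.script)]
    kind = expr[0]
    if kind in ("sup", "sub"):
        _, base, script = expr
        mark = "^" if kind == "sup" else "_"
        # Only the script dimension changes; grouping is left untouched.
        return render(base, state) + render(script, replace(state, script=state.script + mark))
    if kind == "group":
        events = []
        for child in expr[1:]:
            # Only the grouping dimension changes; script is left untouched.
            events += render(child, replace(state, grouping=state.grouping + 1))
        return events
    raise ValueError(f"unknown node kind: {kind!r}")


# x raised to the grouped exponent (a + b)
events = render(("sup", "x", ("group", "a", "+", "b")))
```

Because each kind of node touches only its own field, a grouped expression inside a superscript carries both cues at once.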
Section 3 Knuth's examples of fractions and exponents.
These examples are taken verbatim from the TeX Book, by Donald Knuth.
They are used in the TeX Book to demonstrate the power of the TeX layout
operators.
Notice that all of these examples consist of the same 6 symbols, but are very
different!
AsTeR can render these as unambiguously as TeX can.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
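To see how the same symbols can yield very different expressions, consider the following LaTeX sketch. These four expressions are an illustration written for this purpose, built from the symbols x, y, 2, k, 1 and +; they are not guaranteed to match the exact examples listed above.

```latex
% Illustrative only: the same symbols grouped in four different ways.
\[ \frac{x + y^{2}}{k + 1} \]
\[ x + \frac{y^{2}}{k} + 1 \]
\[ x + \frac{y^{2}}{k + 1} \]
\[ x + y^{\frac{2}{k + 1}} \]
```

Read aloud word by word, all four sound nearly alike, which is why an audio rendering needs explicit cues for grouping and for exponents.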
Section 4 A continued fraction.
Moving along a dimension in audio space defines a perceptibly monotonic
change. This notion of perceptible monotonicity is vital in conveying nesting.
audio
LaTeX
Postscript
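A small Python sketch may help make perceptible monotonicity concrete. It assumes, purely for illustration, that nesting depth is mapped to pitch by a constant factor; the function names, the 120 Hz base and the 0.9 factor are invented here and are not taken from AsTeR.

```python
def pitch_for_depth(depth, base_hz=120.0, factor=0.9):
    """Assumed mapping: each nesting level lowers the pitch by a fixed ratio."""
    return base_hz * factor ** depth


def continued_fraction(terms):
    """Build nested tuples for a0 + 1/(a1 + 1/(a2 + ...))."""
    expr = terms[-1]
    for term in reversed(terms[:-1]):
        expr = ("+", term, ("frac", "1", expr))
    return expr


def leaf_depths(expr, depth=0):
    """Yield (leaf, depth); depth grows only on entering a denominator."""
    if isinstance(expr, str):
        yield expr, depth
    elif expr[0] == "+":
        yield from leaf_depths(expr[1], depth)
        yield from leaf_depths(expr[2], depth)
    else:  # ("frac", numerator, denominator)
        yield from leaf_depths(expr[1], depth)
        yield from leaf_depths(expr[2], depth + 1)


leaves = list(leaf_depths(continued_fraction(["a0", "a1", "a2"])))
pitches = [pitch_for_depth(d) for d in sorted({d for _, d in leaves})]
```

Because the pitch moves in one direction only as depth increases, a listener can tell "deeper" from "shallower" without counting levels.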
Section 5 Simple School algebra.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
Notice the choice of unambiguous renderings for the following expressions:
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
Section 7 Trigonometric identities.
Written mathematical notation can be ambiguous and hard to recognize.
Notice the complete absence of parentheses in some of the examples below.
AsTeR uses several heuristics to construct the correct tree structure for
these expressions.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
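The kind of heuristic needed to recover structure from unparenthesized trigonometric notation can be sketched as below. This is one hypothetical rule (a function name takes everything up to the next function name or operator as its argument); the source does not say which heuristics AsTeR actually uses, and the token format and names here are assumptions.

```python
FUNCTIONS = {"sin", "cos", "tan"}
OPERATORS = {"+", "-", "="}


def group_arguments(tokens):
    """Attach to each function name the tokens that follow it.

    The argument ends at the next function name or operator. Tokens that
    are not function names are passed through unchanged.
    """
    result = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in FUNCTIONS:
            j = i + 1
            argument = []
            while j < len(tokens) and tokens[j] not in FUNCTIONS | OPERATORS:
                argument.append(tokens[j])
                j += 1
            result.append((token, argument))
            i = j
        else:
            result.append(token)
            i += 1
    return result


product = group_arguments("sin 2 x cos x".split())
total = group_arguments("sin x + cos y".split())
```

Under this rule "sin 2 x cos x" is read as the product of sin(2x) and cos(x), not as sin applied to everything that follows.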
Notice the context-specific rendering when speaking the base of the logarithm.
The renderings are chosen to reduce cognitive load: "log base a of x"
as opposed to "log of x to the base a".
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
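The context-specific treatment of the logarithm's base can be illustrated with a short Python sketch. The function name and the fallback wording "sub" are assumptions for this example, not AsTeR's actual rules; only the phrase "log base a of x" comes from the text above.

```python
def speak_subscripted(head, subscript, argument=None):
    """Speak a subscripted symbol, treating the subscript of log as its base."""
    if head == "log" and argument is not None:
        # Context-specific rule: the base is spoken before the argument,
        # so the listener does not have to hold the argument in memory.
        return f"log base {subscript} of {argument}"
    spoken = f"{head} sub {subscript}"  # generic fallback for other symbols
    if argument is not None:
        spoken += f" of {argument}"
    return spoken


log_phrase = speak_subscripted("log", "a", "x")
generic_phrase = speak_subscripted("x", "i")
```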
Context-specific rendering rules allow AsTeR to interpret the superscripts as
exponents. Such interpretation is not hard-wired into the renderings; it is
fully customizable by the user.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
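One way to picture rendering rules that are customizable and not hard-wired is a rule table that the user can replace. The sketch below is an assumption about how such a table might look; the rule names, the dictionary layout and the spoken phrases are invented for illustration.

```python
def spoken_exponent(base, superscript):
    """Interpret the superscript as an exponent."""
    if superscript == "2":
        return f"{base} squared"
    if superscript == "3":
        return f"{base} cubed"
    return f"{base} to the power {superscript}"


def spoken_literal(base, superscript):
    """Make no interpretation: just report the layout."""
    return f"{base} superscript {superscript}"


DEFAULT_RULES = {"superscript": spoken_exponent}


def speak_superscript(base, superscript, rules=DEFAULT_RULES):
    """Look up the active rule and apply it; rules is user-replaceable."""
    return rules["superscript"](base, superscript)


as_exponent = speak_superscript("x", "2")
as_layout = speak_superscript("x", "2", rules={"superscript": spoken_literal})
```

Swapping the entry in the table changes the interpretation without touching the code that walks the expression.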
The first of these examples, probably the most innocuous, is also the most
difficult to recognize; it is impossible to determine the variable of
integration.
Notice that AsTeR interprets triple integrals as the nested application of the
integral operator.
A user can browse the triple integral and listen to its sub-pieces.
The integrals shown in examples 3 and 4 can trick the most experienced of
human readers into an error.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
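Reading a triple integral as the nested application of the integral operator can be sketched in a few lines of Python. The tuple representation and function names are assumptions for this example; the only convention relied on is the usual one that the first differential written belongs to the innermost integral.

```python
def nest_integrals(variables, integrand):
    """Build nested ("integral", variable, body) tuples.

    variables are given in the order the differentials are written,
    e.g. ["x", "y", "z"] for "dx dy dz", so "x" ends up innermost.
    """
    expr = integrand
    for variable in variables:
        expr = ("integral", variable, expr)
    return expr


def sub_pieces(expr):
    """Yield the expression and each successively nested body, outermost first."""
    while isinstance(expr, tuple):
        yield expr
        expr = expr[2]
    yield expr


triple = nest_integrals(["x", "y", "z"], "f")
pieces = list(sub_pieces(triple))
```

With this structure, browsing the integral amounts to stepping through `pieces` and listening to one level at a time.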
Notice that the same expression can be written in more than one way.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
Section 13 Cross referenced equations.
The following section is meant to illustrate AsTeR's rendering of
cross-references, and is most effective when AsTeR is used interactively.
AsTeR enables the listener to give meaningful names to cross-referenceable
objects, and uses these names when referring to such objects in later
cross-references.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
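The naming of cross-referenceable objects described at the start of this section can be modelled with a small lookup table. This is a hypothetical sketch: the class, its method names and the example label are assumptions, and the fallback to "equation N" is a guess at a sensible default.

```python
class CrossReferences:
    """Map LaTeX-style labels to names chosen by the listener."""

    def __init__(self):
        self._names = {}

    def name(self, label, spoken_name):
        """Record the listener's name for a labelled object."""
        self._names[label] = spoken_name

    def refer(self, label, number):
        """Return what to speak for a reference to the given label."""
        return self._names.get(label, f"equation {number}")


refs = CrossReferences()
before = refs.refer("eq:energy", 3)
refs.name("eq:energy", "the energy equation")
after = refs.refer("eq:energy", 3)
```

A spoken name is easier to remember than an equation number, which is all a listener would otherwise have to go on.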
Notice that AsTeR produces good intonational structure when speaking text that
is intermixed with mathematics.
audio
LaTeX
Postscript
Section 15 Quantified expression.
The quantifiers present an interesting challenge to AsTeR's recognizer.
audio
LaTeX
Postscript
Once again, perceptible monotonicity allows AsTeR to convey the following
deeply nested expressions succinctly.
These examples were produced with the Emacs Calculator, a full-fledged
symbolic algebra system.
AsTeR interfaces directly with this calculator, and renders the output just
as well as it can render any document.
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
-
audio
LaTeX
Postscript
AsTeR uses stereo effects to convey the two-dimensional structure of the
matrix. Rendering commences on the left, and moves progressively right as
each element of any row is spoken.
audio
LaTeX
Postscript
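The left-to-right movement across a matrix row can be sketched as a pan value per column. The numeric convention (-1.0 for full left, +1.0 for full right), the even spacing and the function names are assumptions made for this example, not a description of how AsTeR drives its audio hardware.

```python
def pan_positions(n_columns):
    """Return one pan value per column, from -1.0 (left) to +1.0 (right)."""
    if n_columns == 1:
        return [-1.0]  # a single column starts on the left, like any row
    return [-1.0 + 2.0 * column / (n_columns - 1) for column in range(n_columns)]


def render_matrix(rows):
    """Return (element, pan) events, restarting on the left for each row."""
    events = []
    for row in rows:
        events.extend(zip(row, pan_positions(len(row))))
    return events


events = render_matrix([["a", "b", "c"], ["d", "e", "f"]])
```

Elements in the same column share a pan value, so the column structure is heard as position in the stereo field.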
Section 18 Faa de Bruno's formula.
This section presents Faa De Bruno's formula, taken from Knuth's Art Of
Computer Programming, Vol. 1.
I first heard it spoken by an RFB reader on a talking book; it took 120 seconds
to speak.
Since the renderings produced by AsTeR utilize features of the audio space
not available to a human reader (I still have not met a reader who can change
the size and shape of her head as she talks :-)),
the rendering takes under 80 seconds.
As you will hear soon, even this is too long; you forget the beginning by the
time you hear the end.
Later, we present rendering using variable substitution, a powerful technique
for conveying top-level structure of complex expressions.
- Notice the proper intonational structure produced for text intermixed
with mathematics.
audio
LaTeX
Postscript
- audio
LaTeX
Postscript
- Here is Faa De Bruno's formula in all its glory:
Audio (66 seconds)
LaTeX
Postscript
AsTeR can process complex expressions like the above, and upon request,
replace complex sub-expressions with meaningful identifiers. Such renderings
convey top-level structure; the listener can then listen to the
sub-expressions separately.
Since this substitution process is performed by AsTeR, there is no LaTeX or
Postscript equivalent for the audio output in this case.
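The variable substitution technique can be sketched in Python as follows. This is a simplified, hypothetical version: it replaces only the direct children of the top-level node, measures complexity by counting leaves, and generates plain names like `u1`, whereas the text above speaks of meaningful identifiers. All names and the threshold are assumptions.

```python
def size(expr):
    """Count the leaves of a nested-tuple expression (operator, child, ...)."""
    if isinstance(expr, str):
        return 1
    return sum(size(child) for child in expr[1:])


def substitute(expr, threshold=3):
    """Replace each large direct child with a generated identifier.

    Returns the simplified top-level expression and a dictionary mapping
    each identifier to the sub-expression it stands for.
    """
    bindings = {}
    if isinstance(expr, str):
        return expr, bindings

    def name_for(sub_expression):
        identifier = f"u{len(bindings) + 1}"
        bindings[identifier] = sub_expression
        return identifier

    children = tuple(
        name_for(child) if size(child) > threshold else child
        for child in expr[1:]
    )
    return (expr[0],) + children, bindings


fraction = ("frac", ("+", "a", "b"), ("+", "c", "d"))
top_level, bindings = substitute(("+", fraction, "e"))
```

The listener first hears the short top-level form and can then ask for whatever each identifier in `bindings` stands for.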
T.V. Raman raman@crl.dec.com
Last modified: Fri Aug 5 10:06:00 1994