Auditory User Interfaces -- Foreword

By David Gries

In his award-winning Ph.D. thesis some three years ago, T. V. Raman described a computing system, called AsTeR, for rendering electronic documents aurally. Instead of having to read documents on a monitor or on paper, one can now listen to them. In fact, AsTeR's spoken math is far easier to understand than yours or mine. Moreover, the listener can browse the spoken document and have parts of it repeated, even in a different speaking style. AsTeR allows the usually passive listener to become an active participant in the understanding of an aural rendering of a document.

Now, Raman has dramatically extended his ideas on talking computers. His computer(s) can speak more and more to him in a useful and sophisticated fashion. His aural desktop allows him to listen directly to applications as he navigates his file system, manages tasks, maintains his calendar and rolodex, edits, handles his email, browses the web, develops and debugs programs, and reads articles, memos, and books.

Raman's speech-enabling idea is to allow applications to produce aural output directly, using the same information that is used for more conventional visual output. The AUI (Auditory User Interface) works directly with the computational core of the application itself, just as the conventional GUI (Graphical User Interface) does.

In this book, you will read Raman's philosophy on user interaction. You'll learn about the shortcomings of speaking the screen and the benefits of a real AUI. You'll also see how Raman takes advantage of sophisticated facilities in Emacs to implement his ideas for AUIs.

Raman actually calls for a return to simplicity, for that simplicity can be harnessed to provide more effective human-computer interaction. In the early days, there was quite a clean separation between computation and user interface (there almost had to be, since I/O was so primitive). As peripherals became more complicated, the separation became muddier and muddier. By enforcing a cleaner separation, Raman gets to build nice AUIs and make his computing environment more effective and efficient.

Raman's system adds one more dimension to human-computer interaction. I have been in the computing business for almost 40 years now, and I continue to be amazed at all these advances. Let me spend just a few paragraphs on the changes I have seen; at the least, it may provide the younger set with a few chuckles.

I took my one and only course in computing in 1959, as a college senior. We learned how to program in a virtual assembly language. It didn't matter that the language was not real, since there weren't any computers to run our programs anyway. Forty years ago, in almost all universities and for most of the world, user-computer interaction was nonexistent.

Around 1964, I helped teach programming in Germany on a machine whose input device was a paper-tape reader. Paper tape came on a roll; holes were punched in the paper to record information. Of course, the punched card reader had also been available for years (a punched card could contain up to 80 characters of information). If you made a mistake on a punched card, you only had to retype that card, and not a whole paper tape.

The existence of punched cards did not always make the computer readily accessible. For example, at Cornell, in about 1970, the mainframe computer that ran students' programs was near the airport, some 4-5 miles away. Twice daily, the decks of punched cards to be run on the machine were trucked to the airport, and the output from execution came back four or five hours later! A year or two later, card readers hooked directly to the mainframe were placed at several locations around the campus. But even then, all through the 1970s, as the deadline for a programming assignment neared, the line of students waiting to put their cards into the card reader grew longer and longer. Sometimes, students waited half an hour, and then waited another hour for the program to run and the output to be printed.

In the late 1970s, in many places "terminals" replaced punched-card input. But the real change didn't come until personal computers were introduced: first on machines like the Terak, and finally in about 1983 with the introduction of the Macintosh (with 256K of memory, a floppy disk, and no hard disk). For the first time, one had almost instant feedback during compilation and execution. No longer did one have to wait five minutes to five hours for the output of a compilation or execution!

This desktop paradigm, with keyboard-mouse input and screen output, has remained largely unchanged for perhaps ten years, although machines got faster, disks and memory larger, and program environments more sophisticated. And graphics, not just text, came to be an important part of the human-computer interface.

The latest change was the advent of the World Wide Web and then Java as a language for writing interactions. Now, even programs written by students in the first programming course can use browsers like Netscape, making the assignments much more interesting and enlightening.

In summary, forty years have seen remarkable change in human-computer interaction. How do all these changes come about, and what changes can we expect in the future? I can't answer the second question, but the answer to the first question is easy: the changes are driven by the vision of people like T. V. Raman. Discontented with their current situation, but enthusiastic, creative, and persistent, these visionaries work to make drastic improvements. I wonder what the next forty years of computing will bring.

David Gries
William L. Lewis Professor of Engineering
Cornell Presidential Weiss Fellow
Computer Science Department
Cornell University, Ithaca NY.

raman@adobe.com

Last modified: Tue Aug 19 17:08:44 1997