Department of Computer Science Colloquium
Thursday, September 19, 2002, 4:15pm 
Upson Hall B17

Capturing the Content of Computer Science

Michael Kohlhase
Carnegie Mellon University

One of the pertinent tasks of computer science is to supply techniques for structuring data and representing it in a form that supports algorithmic problem solving and added-value services.

Surprisingly, the field does very little to apply these techniques to its own research and educational materials. We still predominantly use tools like LaTeX for publishing our papers and PowerPoint for presenting CS theory and practice to our students. In effect, we produce large volumes of data about CS knowledge without turning it into a structured resource. In this talk I will present techniques for content-based markup of CS documents and some of the added-value services these techniques support.

Content markup techniques are becoming increasingly popular on the XML-based World Wide Web. They add enough structure to allow for automated document processing (in contrast to presentation markup, which facilitates human document processing) without imposing the burden of fully formalizing the knowledge contained in the document.
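To make the contrast concrete, here is a minimal sketch, not taken from the talk, that encodes the expression x + 2 once in presentation MathML and once in content MathML. The element names (mrow, mi, mo, mn, apply, plus, times, ci, cn) are standard MathML; the evaluate function, the two constants, and the restriction to plus and times are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Presentation markup: says how x + 2 looks (identifier, operator glyph, number).
PRESENTATION = "<mrow><mi>x</mi><mo>+</mo><mn>2</mn></mrow>"

# Content markup: says what x + 2 means (apply the function plus to x and 2).
CONTENT = "<apply><plus/><ci>x</ci><cn>2</cn></apply>"


def evaluate(node, env):
    """Evaluate a tiny, illustrative subset of content MathML.

    Hypothetical helper: handles only cn, ci, and apply with plus/times.
    """
    if node.tag == "cn":  # numeric literal
        return float(node.text)
    if node.tag == "ci":  # identifier, looked up in the environment
        return env[node.text.strip()]
    if node.tag == "apply":  # first child is the operator, the rest are arguments
        op, *args = list(node)
        values = [evaluate(arg, env) for arg in args]
        if op.tag == "plus":
            return sum(values)
        if op.tag == "times":
            product = 1.0
            for value in values:
                product *= value
            return product
    raise ValueError(f"unsupported element: {node.tag}")


if __name__ == "__main__":
    print(evaluate(ET.fromstring(CONTENT), {"x": 3.0}))
```

The presentation string records only a sequence of glyphs, so a program would have to guess that the mo element denotes addition. The content string names the operator explicitly, which is what lets a generic tree walk compute with it.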

I will discuss relevant content markup formats like MathML, OpenMath, DocBook, and OMDoc, and extend the latter with the ability to mark up program code (CodeML) to arrive at a full-coverage markup format for CS content.

The talk concludes with a brief overview of the Course Capsules Project at Carnegie Mellon University, where these techniques are employed in computer-supported courseware.


Bio: Dr. Michael Kohlhase is an associate professor in the CS department of Saarland University (Germany) and an adjunct associate professor at the School of Computer Science at Carnegie Mellon University. He studied pure mathematics at the University of Bonn (1989), and wrote his dissertation on higher-order unification and automated theorem proving (1994, Saarland University). Since then, he has taken up research on applying techniques from automated deduction to natural language semantics. His current research interests include automated theorem proving, content-based markup techniques for mathematics and computer science, and natural language processing. He has pursued these interests during extended visits to Carnegie Mellon University, SRI International, and the Universities of Amsterdam and Edinburgh.