New Faculty

  Rich Caruana
  Assistant Professor
  Department of Computer Science
  Ph.D. Carnegie Mellon University,  1997

I work in machine learning and data mining, medical decision making and bioinformatics, feature selection, missing values, inductive transfer, artificial neural networks, and memory-based learning. I joined the Department of Computer Science at the start of the fall 2001 semester.


"Predicting Cesarean Delivery: Decision Tree Models." To appear in the American Journal of Obstetrics and Gynecology (2001). With Cynthia J. Sims, Leslie Meyn, Rich Rao, R. Bharat, Tom Mitchell, and Marijane Krohn.

"An Evaluation of Machine Learning Methods for Predicting Pneumonia Mortality." Artificial Intelligence in Medicine 9:107-138 (1997). With G. F. Cooper, C. F. Aliferis, R. Ambrosino, J. Aronis, B. G. Buchanan, M. J. Fine, C. Glymour, G. Gordon, B. H. Hanusa, J. E. Janosky, C. Meek, T. Mitchell, T. Richardson, and P. Spirtes.

"Multitask Learning." Machine Learning 28:41-75, Kluwer Academic Publishers (1997).

"Experience with a Learning Personal Assistant." Communications of the ACM (1994). With Tom Mitchell, Dayne Freitag, John McDermott, and David Zabowski.

"Fifteen Useful Tricks with Extra Outputs." In Neural Networks: Tricks of the Trade, G. B. Orr, and K­R. Muller, editors, Springer-Verlag (1988).


"Using Active Monitor Illumination for 3-D Active Imaging." Patent disclosure filed April, 2000. With Rahul Sukthankar, Keiko Hasegawa, and Matt Mullin.

"Iterated K-nearest neighbor Method and Article of Manufacture for Filling in Missing Values." United States Patent 6,047,287, Assignee: Justsystem Pittsburgh research Center, Pittsburgh, PA. Filed May 5, 1998, granted April 4, 2000.



  K-Y. Daisy Fan
  Assistant Professor
  Computer Science Department
  Ph.D. Cornell University, 2001

My research interests include the application of systems analysis techniques to water resources and environmental problems. I joined the Department of Computer Science at the start of the fall 2001 semester. With Dave Schwartz, I run CS 100 and develop the academic excellence workshops that are associated with that very important course.


Graduate Teaching Assistant Award, Department of Computer Science, Cornell University (2000).

New York State Section American Water Works Association Russell L. Sutphen Scholarship.

John E. Perry Teaching Assistant Prize, School of Environmental Engineering, Cornell University (1999).


"Regression Dynamic Programming for High-dimensional Continuous-state Problems." In preparation for submittal to Operations Research.

"Regression Dynamic Programming for Multiple Reservoir Control." Proceedings of the 2000 ASCE Joint Conference on Water Resources Engineering and Water Resources Planning and Management,
Minneapolis, MN (2001).




  Paul Ginsparg
  FCI, joint with Physics
  Ph.D. Cornell, 1981

I received my A.B. degree summa cum laude in Physics from Harvard University in 1977 and my Ph.D. in Physics from Cornell University in 1981 (Quantum Field Theory, thesis advisor: Kenneth G. Wilson), where I was supported as an NSF graduate fellow and A. D. White fellow. I was in the Harvard Society of Fellows from 1981-84, and an assistant professor in the Harvard University Physics department from 1984-90, where I was supported as an A.P. Sloan Fellow and as a DOE Outstanding Junior Investigator. I was a Technical Staff Member in the Los Alamos National Laboratory Theoretical Division from 1990-2001. I have also held visiting positions at C.E.N. Saclay, France; Princeton University; Stanford Linear Accelerator Center; the Institute for Advanced Study, Princeton; the Institute for Theoretical Physics at UC Santa Barbara; the Mathematical Sciences Research Institute at UC Berkeley; and Hebrew University, Jerusalem. In 1991 I initiated the "e-print arXiv" as a new form of communications research infrastructure for physics. I have served on many committees and advisory boards, including most recently an NRC/CODATA committee on "Transborder flow of Scientific Data," an NAS/NRC committee on "Future of Universities," an AAAS study committee on "Transition from Paper," and an NSF committee on "Knowledge Networks and Distributed Intelligence Initiative." I currently serve on the APS global "Task Force on Electronic Information Systems," on the NIH's "PubMed Central National Advisory Committee," on the Open Archives Initiative Steering Committee (of which I am a founder), on the French "Centre pour la Communication Scientifique Directe" technical steering committee (of which I am also a founder), and on the APS "Publications Oversight Committee."
I have also given numerous invited keynote talks and colloquia, including recently at the meetings "Future of Mathematical Communication" at the University of Minnesota; "Electronic Publishing in Science" at UNESCO HQ in Paris; "50th Anniversary of the Development of the ENIAC computer" at the University of Pennsylvania; "The impact of electronic publishing on the academic community" in Stockholm, Sweden; "Alternative Models for Scholarly Publishing in Higher Education" at UC Berkeley; the Chautauqua meeting of the National Computational Science Alliance at the University of Kentucky, Lexington; the joint meeting of the Medical Library Association and the Canadian Health Library Association, Vancouver, BC; "The Impact of Barrier-free Access and New Technologies on Biomedical Publishing" at the New York Academy of Medicine; and "Open Archives European Open Day" at the Max Planck Library in Berlin.


P.A.M. (Physics Astronomy Math) award from the Special Libraries Association. "Honors work which demonstrably improves the exchange of information in physics, math or astronomy" (1998).

Fellow of the American Physical Society. Cited "For his work relating to chiral symmetry on the lattice, for fundamental contributions to string theory, and for establishment and development of the revolutionary 'Los Alamos E-Print Archive'" (November, 2000).


"A Remnant of Chiral Symmetry on the Lattice." Phys. Rev. D25, 2649 (1982). And K. G. Wilson.

"Curiosities at 'c=1'."Nucl. Phys. B295 [FS21], 153 (1988).

"2D Gravity + 1D Matter." Phys. Lett. B240, 333 (1990). And J.Zinn-Justin.

"First Steps towards Electronic Research Communication." Computers and Physics 8 (4):390 (July/August, 1994).

"Winners and Losers in the Global Research Village." In proceedings of 'Electronic Publishing in Science,' Sir R. Elliot and D. Shaw, editors. Held at UNESCO HQ, Paris, ICSU Press (1996).



  Thorsten Joachims
  Assistant Professor
  Department of Computer Science
  Ph.D. Dortmund, 1997

My core scientific interests lie in machine learning and statistical learning theory, with its main applications in the fields of text mining and intelligent information agents. This application domain will be of increasing importance, since human attention and the speed with which we can process information are natural bottlenecks that limit our ability to make informed decisions. While database and data-mining techniques can already assist users when analyzing structured data, the problem of text mining, the automatic analysis of text-based information, is still largely unsolved. Therefore, from an application perspective, my goal is to develop systems and agents that focus, enhance, and accelerate our ability to access large quantities of natural language information. Central to such systems are, for example, tasks like text and speech classification, information extraction, and abstract generation.
To investigate the fundamental challenges in these tasks, I am particularly interested in machine learning approaches. This is because most tasks dealing with natural-language-based information are inherently difficult to formalize and solve manually. In text classification, for example, there is no formal language that lets us describe common categories without reference to human-like background knowledge. Machine learning approaches have shown their ability to overcome the lack of background knowledge by exploiting statistical regularities of word usage patterns that are related to humanly defined concepts. The success of such learning methods is already reflected in their high commercial demand, and I am convinced that machine learning approaches can further push the limit in understanding text-based information. Therefore, from a methodological perspective, my goal is to develop and understand machine learning approaches that fit the properties of text-mining problems.


"Knowledge Discovery and Knowledge Validation in Intensive Care." Artificial Intelligence in Medicine
19(3):225-249 (2000). With K. Morik, M. Imhoff, P. Brockhausen and U. Gather.

"Aktuelles Schlagwort: Support Vector Machines." Künstliche Intelligenz 33(4):54-55 (1999).

"Browsing-assistenten, Tour Guides und Adaptive WWW-server." Künstliche Intelligenz 28(3):23-29 (1998). With D. Mladenic.

"Making Large-scale SVM Learning Practical." In Advances in Kernal Methods ­ Support Vector Learning, Schölkopf et al., editors, 11:169-184, MIT Press (1999).

"Text Categorization with Support Vector Machines: Learning with Many Relevant Features." In Proceedings of the European Conference on machine Learning (ECML), Chemnitz, Germany (1998).
Ranked 12 in NEC Research Index most accessed publications (August, 2000).




  Hod Lipson
  Assistant Professor
  FCI, joint with Mechanical and Aerospace Engineering
  Ph.D. Technion, 1998

Engineers design by combining knowledge and resources to make products that achieve some functionality. Although design is the basis of engineering, this process of problem solving by synthesis is not well understood, and is still taught, to a large extent, as an art. While we have elaborate computational models for analysis, we still have no computational model of synthesis. I believe understanding this process holds the key to future competitiveness, and presents a largely unaddressed challenge across both engineering and computer science.
My research interests are in the area of computational design, information systems and fabrication at the intersection of engineering and computer science. I am interested in understanding the synthesis process of design and emulating it computationally, and I focus on the ideas of self-organization and self-replication as new paradigms of design, fabrication and learning. I search for ways to harness all these areas to make the future CAD/CAM systems. I look both at the human design process and at natural design and fabrication as two sources of inspiration, and I build working systems to test my theories.

One of Technology Review's ten most promising technologies of the future, 2001.
TIME Magazine's Annual 2001.
Shaping the Future Award, EXPO '2000.
Fischbach Postdoctoral Scholarship, 1998.
CIRP International F. W. Taylor Medal, 1997.
Charles Clore Scholarship Award for Academic Excellence, 1996.
Miriam and Aaron Gutwirth Memorial Fellowship, 1996.


"Automatic Design and Manufacture of Robotic Lifeforms." Nature 406:974-978 (2000). With J.B. Pollack.

"Clustering Irregular Shapes using High Order Neurons." Neural Computation 12(10):2331-2353 (2000). With H. T. Siegelmann.

"Conceptual Design and Analysis by Sketching." Journal of AI Design and Manufacturing 14:391-401 (2000). With M. Shpitalni.

"3D Conceptual Design of Sheet Metal Products by Sketching." Journal of Materials Processing Technology 103(1):128-134 (2000). With M. Shpitalni.

"Identification of Faces in a 2D Line Drawing Projection of a Wireframe Project." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 18(10):1000-1012 (1996). With M. Shpitalni.


  Jeanna Neefe Matthews
  Assistant Professor
  Department of Computer Science
  Ph.D. Berkeley, 2000

My research interests include file systems, storage systems and more generally operating systems and distributed systems.


Intel Foundation Graduate Fellowship Award, 1999-2000.

Cal VIEW Fellow Award, for excellence in teaching, 1998-1999.

National Science Foundation Graduate Research Fellowship, 1994-1998.

Award Paper, 1995 ACM Symposium on Operating Systems Principles.


"Improving the Performance of Log-structured File Systems with Adaptive Methods." Proceedings of
the Sixteenth ACM Symposium on Operating Systems Principles, 238-251 (October, 1997). And D.
Roselli, A. Costello, R. Wang, and T. Anderson.

"Serverless Network File Systems." ACM Transactions on Computer Systems (February, 1996). With T. Anderson, M. Dahlin, D. Patterson, D. Roselli, and R. Wang.

"Serverless Network File Systems." Award paper. In Proceedings of the Fifteenth ACM
Symposium on Operating Systems Principles, 109-126 (December, 1995). With T. Anderson, M. Dahlin, D. Patterson, D. Roselli, and R. Wang.

"A Case for Network of Workstations: NOW." IEEE Micro (February, 1995). With T. Anderson, D. Culler, D. Patterson, and the NOW Team.

"An Exploration of Network RAM." UC Berkeley Technical Report, UCB/CSD-98-1000 (December, 1994). With E. Anderson.


  Radu Rugina
  Assistant Professor
  Department of Computer Science
  Ph.D. University of California, Santa Barbara, 2001

Program analysis automatically extracts information that is critical for understanding, maintaining and debugging a program; for checking properties of the program; or for applying various transformations to it. Although it has traditionally been considered part of the programming languages and compilers community, program analysis has applications in virtually all areas of computer science, and these applications can deliver substantial benefits to scientists in these fields. Research in program analysis is interesting, relevant and important.

In the last few years I have developed new techniques for pointer analysis and symbolic analysis of accessed memory regions. These techniques can analyze general programs, including programs that use recursion and multithreading and that manipulate pointers. I have also worked with researchers in other fields to apply these techniques to problems in computer architecture, parallel computing, and software engineering. Concrete results include the automatic parallelization of sophisticated divide and conquer problems, the static detection of array bounds violations and data races in multithreaded C programs that heavily use pointers and pointer arithmetic, and the use, by computer architects, of pointer analysis results to map programs onto the MIT RAW machine and hardware circuits.

In the future, I intend to extend this research to include software reliability and computer security. In these fields, there is a need to automate the process of checking important properties required to guarantee the functionality or safety of the program. For large, complex pieces of software, checking these properties manually is a difficult and error-prone task. However, using formal verification techniques based on program annotations requires programmers to substantially change the way they write software. In this context, program analysis is an appealing alternative. It can be used to develop automatic tools to solve these kinds of problems without requiring changes in programming style. I intend to develop deep program analysis techniques and focus on the application of these techniques to software reliability and security. I believe that to successfully solve these problems, the analysis techniques have to be general, not restricted to a certain class of programs, such as scientific applications. These program analyses also have to be based on general, formally correct frameworks, not on ad-hoc techniques.


Tuition Fellowship from UC Santa Barbara between 1996 and 2001.
Merit Fellowship from Polytechnica University of Bucharest between 1991 and
Scholarships from the European Union TEMPUS Program for several projects at the Technical University of Eindhoven in Holland: April-June 1992, March-May 1993, and May-July


"Recursion Unrolling for Divide and Conquer Problems." In Proceedings of the13th International Workshop on Languages and Compilers for Parallel Computing, IBM T.J. Watson Research Center,
Yorktown Heights, NY (August, 2000). With Martin Rinard.

"Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions." In Proceedings of the ACM SIGPLAN 1999 Conference on Programming Languages Design
and Implementation, Atlanta, GA (May, 1999). With Martin Rinard.

"Automatic Parallelization of Divide and Conquer Algorithms." In Proceedings of the ACM
SIGPLAN 1999 Conference on Programming Languages Design and Implementation, Atlanta, GA
(May, 1999). With Martin Rinard.

"Predicting the Running Times of Parallel Programs by Simulation."
In Proceedings of the 12th International Parallel Processing Symposium and 9th Symposium
on Parallel and Distributed Processing, Orlando, FL (April, 1998). With Klaus E. Schauser.


  Phoebe Sengers
  Assistant Professor
  FCI, joint with Science and Technology Studies
  Ph.D. Carnegie Mellon University, 1998

I am a computer scientist and a cultural theorist. I build intelligent, interactive, expressive information systems, like an artificial agent that makes emotionally expressive childlike drawings. The goal of my work is to analyze carefully what our systems currently unconsciously express and to develop technology that allows us to express new aspects of human experience. I use the tools of cultural theory as a way to understand our systems better and to generate technical ideas for new forms of technology. I am part of a growing community of critical technical practitioners.

I have done research on agents, avatars, virtual environments, and computer graphics at the GMD in Bonn, Germany. I am active in the Narrative Intelligence research community. Last year, I was a Fulbright Guest Researcher at the Center for Art and Media Technology (ZKM) in Karlsruhe. In August 1998, I graduated from Carnegie Mellon University with a self-defined interdisciplinary Ph.D. in Artificial Intelligence and Cultural Theory (administered jointly by the Department of Computer Science and the Program in Literary and Cultural Theory).


Lingua Franca, Tech Top 20 (July, 1999), named one of the "top 20 researchers changing the way we think about technology."
Fulbright Fellowship (September 1998-July, 1999).
AAAI Doctoral Consortium (August, 1996).
Office of Naval Research Allen Newell Graduate Fellowship (October 1994-September 1997).
Member, National Research Council Study for Computers and Creativity.


Narrative Intelligence. Michael Mateas and Phoebe Sengers, editors. Advances in Consciousness Series. Amsterdam: John Benjamins Publishing Company, forthcoming.

"Narrative Intelligence." In Human Cognition and Social Agent Technology. Kerstin Dautenhahn, editor. Advances in Consciousness Series. John Benjamins Publishing Company, 2000.

"Fabrikation der Subjekte: Verdinglichung, Schizophrenie, und Kuenstliche Intelligenz." In Netzkritik: Materialien zur Internet-Debatte. Geert Lovink and Pit Schultz, editors. Berlin: Edition ID-Archiv

"Technological Prostheses: An Anecdote." ZKP-4 Net Criticism Reader. Geert Lovink and Pit Schultz, editors (1997).

"Fabricated Subjects: Reification, Schizophrenia, Artificial Intelligence." ZKP-2 Net Criticism Reader. Geert Lovink and Pit Schultz, editors (1996).

"Practices for Machine Culture: A Case Study of Integrating Artificial Intelligence and Cultural Theory." Surfaces VIII (1999).

"Madness and Automation: On Institutionalization." Postmodern Culture (May, 1995).

"Wallowing in the Quagmire of Language: Artificial Intelligence, Psychiatry, and the Search for the Subject." Cultronix (Summer, 1994).


  Jayavel Shanmugasundaram
  Assistant Professor   
  Computer Science Department
  Ph.D. University of Wisconsin, 2001

My broad research agenda is to build software systems that can serve as the infrastructure for creating and deploying Internet-based business applications (also referred to as e-business applications). The need for building such infrastructure systems arises because currently, in order to build e-business applications, application developers have to program against relatively low-level interfaces and write a lot of special purpose code. This plight of application developers is analogous to that of programmers in the early days of computing, who had to write assembly language programs without the aid of software systems such as compilers and operating systems. My research goal of building software infrastructure systems for e-business applications is thus motivated by the need to provide developers with higher levels of abstraction.


"Accessing Extra-database Information: Concurrency Control and Correctness." Information Systems, An International Journal 23(7):439-462 (1998). With Narain Gehani, Krithi Ramamritham, and Oded

"Efficiently Publishing Relational Data as XML Documents." In Proceedings of the Conference on Very Large Databases (VLDB), Cairo, Egypt (September, 2000). With Eugene Shekita, Rimon Barr, Michael Carey, Bruce Lindsay, Hamid Pirahesh, and Berthold Reinwald.

"Relational Databases for Querying XML documents: Limitations and Opportunities." In Proceedings of the Conference on Very Large Databases, Edinburgh, Scotland (September, 1999). With Kristin Tufte, Gang He, Chun Zhang, David DeWitt, and Jeffrey Naughton.

"Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous
Dimensions." In Proceedings of the ACM SIGKIDD Conference on Knowledge
Discovery and Data Mining, San Diego, CA (August, 1999). With Usama Fayyad, and Paul

"Efficient Concurrency Control for Broadcast Environments." In Proceedings of the ACM SIGMOD Cnference on the Management of Data, Pittsburgh, PA (May, 1999). With Arvind Nithrakashyap, Rajendran Sivasankaran and Krithi Ramamritham.

"Use of Recurrent Neural Networks for Strategic Data Mining of Sales Information."
International Management Resources Association (IMRA) Conference, Hershey, PA (May, 1999).
With Maram V. Nagendra Prasad, Sanjeev Vadhavkar, and Amar Gupta.


"Multi-dimensional Database and Data Cube Compression for Aggregate Query Support on
Numeric Dimensions." United States patent application filed April 22, 1999.

"Using an XML Query Language to Publish Relational Data as XML." United States patent
application filed March 21, 2000