Claire Cardie

Claire Cardie, Assistant Professor
cardie@cs.cornell.edu
http://www.cs.cornell.edu/home/cardie/cardie.html
Ph.D. University of Massachusetts, Amherst, 1994

My research focuses primarily on corpus-based approaches for understanding and extracting information from natural language texts, but it spans a number of areas including machine learning, case-based reasoning, and knowledge acquisition. Although current natural language processing (NLP) systems cannot yet perform in-depth text understanding, they can read an arbitrary text and summarize its major events provided those events fall within a particular domain of interest (e.g., stories about natural disasters or terrorist events). To understand the texts, NLP systems rely heavily on handcrafted linguistic knowledge as well as handcrafted knowledge about the domain and about the world in general. Unfortunately, encoding this background knowledge into the system is difficult, time-consuming, and error-prone, and it invariably requires the expertise of computational linguists familiar with the underlying system.

To avoid these difficulties, we have developed a general knowledge acquisition framework, Kenmore, in which natural language processing systems can begin to bootstrap their own knowledge bases directly from the text. The framework, which combines robust partial parsing and machine learning techniques, essentially allows the NLP system to learn the knowledge it needs to process a text. Thus far, Kenmore has been used with corpora from two real-world domains for part-of-speech tagging, word-sense tagging, concept activation, and relative pronoun resolution.

We continue to investigate the use of machine learning techniques as tools for guiding natural language system development and for exploring the mechanisms that underlie language acquisition. This work includes: (1) extending Kenmore to handle additional knowledge acquisition tasks for NLP, e.g. pronoun resolution; (2) extending Kenmore to handle the task of extracting entire knowledge bases, e.g. a rule base, directly from text; and (3) improving the performance of the system by allowing linguistic and cognitive biases to influence our corpus-based approach to learning linguistic knowledge.

Awards

National Science Foundation CAREER Award
Lilly Teaching Fellow
College of Engineering Teaching Award

University Activities

Member: Computer Science Graduate Admissions Committee; Cognitive Studies Undergraduate Committee; Engineering College Committee on Faculty Development and Mentoring
Reviewer: Undergraduate Minority/Under-represented Summer Research Exchange Program; Cognitive Studies Summer Fellowships; Cognitive Studies Continuing Fellowships

Professional Activities

NSF Review Panel
Program committee: 34th Annual Meeting of the Association for Computational Linguistics; Thirteenth National Conference on Artificial Intelligence; Second International Colloquium on Grammatical Inference; Conference on Empirical Methods in Natural Language Processing; International Conference on New Methods in Natural Language Processing; Twelfth International Conference on Machine Learning
Reviewer: Computational Linguistics; Journal of Artificial Intelligence Research; Neural Information Processing Conference

Lectures

The use of cognitive biases in case-based learning of linguistic knowledge. Invited Workshop on Computational Models of Human Syntactic Processing, Netherlands Institute for Advanced Study, Wassenaar, The Netherlands, June 1996.
Automating feature set selection for case-based learning of linguistic knowledge. University of Pennsylvania, Philadelphia, PA, May 1996.
Lexical knowledge acquisition using machine learning techniques. SUNY Binghamton Colloquium Series, Binghamton, NY, September 1995.

Publications

Automating feature set selection for case-based learning of linguistic knowledge. Proceedings of the Conference on Empirical Methods in Natural Language Processing, University of Pennsylvania, PA, 113-126, May 1996.
Embedded machine learning systems for natural language processing: A general framework. Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing, S. Wermter, E. Riloff, and G. Scheler, eds., Springer-Verlag, Berlin, 315-328, 1996.
____. Workshop on New Approaches to Learning for Natural Language Processing, 14th International Joint Conference on Artificial Intelligence, AAAI Press, 119-126, 1995.
Evaluating an information extraction system. Journal of Integrated Computer-Aided Engineering 1,6 (1994) 453-472 (with W. Lehnert, D. Fisher, J. McCarthy, E. Riloff, and S. Soderland).
