Claire Cardie
Professor, Department of Computer Science
          and Department of Information Science
Cornell University
5161 Upson Hall
Office hours (fall 2013): Tuesdays 1-2pm; Fridays 9:00-10:00am.
Research Interests
Teaching
Publications
CV/Resume
My primary research is in the area of natural language processing (NLP) where our goal is to develop algorithms and systems that will vastly improve a user's ability to find, absorb, and extract information from on-line text. My group's research generally proceeds at two complementary levels: we focus both on building real systems for large-scale natural language processing tasks and on developing techniques to address underlying theoretical problems in the syntactic, semantic and pragmatic analysis of natural language. As has become more or less standard in the field, we rely on statistical machine learning techniques as our primary modeling tool, both for guiding natural language system development and for exploring the mechanisms that underlie language understanding. Our current work encompasses a number of related areas:
Hmmm. Need to update this!!!
Estimating the Prevalence of Deception in Online Review Communities. Myle Ott, Claire Cardie, Jeffrey T. Hancock. Proceedings of the 21st International World Wide Web Conference (WWW), 2012.
In Search of a Gold Standard in Studies of Deception. Stephanie Gokhman, Jeff Hancock, Poornima Prabhu, Myle Ott, and Claire Cardie. Proceedings of the EACL 2012 Workshop on Computational Approaches to Deception Detection, 2012.
Multi-aspect Sentiment Analysis with Topic Models. Bin Lu, Myle Ott, Claire Cardie, Benjamin Tsou. Proceedings of the ICDM 2011 Workshop on Sentiment Elicitation from Natural Text for Information Retrieval and Extraction (SENTIRE), 2011.
Automatically Creating General-Purpose Opinion Summaries from Text. Veselin Stoyanov and Claire Cardie. Proceedings of the International Conference Recent Advances in Natural Language Processing 2011 (RANLP), 2011.
Finding Deceptive Opinion Spam by Any Stretch of the Imagination. Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey Hancock. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL HLT), 2011.
Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora. Bin Lu, Chenhao Tan, Claire Cardie, and Benjamin K. Tsou. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL HLT), 2011.
Summarizing Decisions in Spoken Meetings. Lu Wang and Claire Cardie. Proceedings of the ACL Workshop on Automatic Summarization for Different Genres, Media, and Languages, Association for Computational Linguistics (ACL), 2011.
Rulemaking 2.0. Cynthia R. Farina, Mary Newhart, Claire Cardie, and Dan Cosley. University of Miami Law Review, Vol. 65, No. 2, 2011. (Also available as Cornell Legal Studies Research Paper No. 010-010.)
Rulemaking in 140 Characters or Less: Social Networking and Public Participation in Rulemaking. Pace Law Review, 2011. (Also available as Cornell Legal Studies Research Paper No. 010-011.)
Compositional Matrix-Space Models for Sentiment Analysis. Ainur Yessenalina and Claire Cardie. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011.
Multi-level Structured Models for Document Sentiment Classification. Ainur Yessenalina, Yisong Yue, and Claire Cardie. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 2010.
Coreference Resolution with Reconcile. Veselin Stoyanov, Claire Cardie, Nathan Gilbert, Ellen Riloff, David Butler and David Hysom. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), 2010. (short paper)
Hierarchical Sequential Learning for Extracting Opinions and Their Attributes. Yejin Choi and Claire Cardie. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), 2010. (short paper)
Automatically Generating Annotator Rationales to Improve Sentiment Classification. Ainur Yessenalina, Yejin Choi, and Claire Cardie. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), 2010. (short paper)
Reconcile: A Coreference Resolution Research Platform. Veselin Stoyanov, Claire Cardie, Nathan Gilbert, Ellen Riloff, David Butler, David Hysom. Cornell University Technical Report, http://hdl.handle.net/1813/14919, 2010.
Adapting a Polarity Lexicon Using Integer Linear Programming for Domain-Specific Sentiment Classification. Yejin Choi and Claire Cardie. Empirical Methods in Natural Language Processing (EMNLP), 2009.
Conundrums in Noun Phrase Coreference Resolution: Making Sense of the State-of-the-Art. Veselin Stoyanov, Nathan Gilbert, Claire Cardie and Ellen Riloff. Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), 2009.
Learning with Compositional Semantics as Structural Inference for Subsentential Sentiment Analysis. Yejin Choi and Claire Cardie. Empirical Methods in Natural Language Processing (EMNLP), 2008.
Topic Identification for Fine-Grained Opinion Analysis. Veselin Stoyanov and Claire Cardie. Proceedings of the Conference on Computational Linguistics (COLING 2008), 2008.
Guest Editors' Introduction: Text Annotation for Political Science Research. Claire Cardie and John Wilkerson. Journal of Information Technology & Politics, 5:1, 2008.
The Power of Negative Thinking: Exploiting Label Disagreement in the Min-cut Classification Framework. Mohit Bansal, Claire Cardie, and Lillian Lee. Proceedings of the Conference on Computational Linguistics (COLING 2008): Companion volume: Posters, 2008.
Annotating Topics of Opinions. Veselin Stoyanov and Claire Cardie. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, 2008.
An eRulemaking Corpus: Identifying Substantive Issues in Public Comments. Claire Cardie, Cynthia Farina, Matt Rawding, Adil Aijaz. Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, 2008.
A Study in Rule-Specific Issue Categorization for e-Rulemaking. Claire Cardie, Cynthia Farina, Adil Aijaz, Matt Rawding, Stephen Purpura. 9th Annual International Conference on Digital Government Research, Montreal, Canada, 2008.
Active Learning for e-Rulemaking: Public Comment Categorization. Stephen Purpura, Claire Cardie, Jesse Simons. 9th Annual International Conference on Digital Government Research, Montreal, Canada, 2008.
Structured Local Training and Biased Potential Functions for Conditional Random Fields with Application to Coreference Resolution. Yejin Choi and Claire Cardie. NAACL Human Language Technology Conference (NAACL-HLT), 2007.
Identifying Expressions of Opinion in Context. Eric Breck, Yejin Choi, and Claire Cardie. Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 2007.
Cornell System Description for the NTCIR-6 Opinion Task. Eric Breck, Yejin Choi, Veselin Stoyanov, and Claire Cardie. The 6th NTCIR Workshop Meeting, Tokyo, Japan, 2007.
Joint Extraction of Entities and Relations for Opinion Recognition. Yejin Choi, Eric Breck, and Claire Cardie. Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2006.
Partially Supervised Coreference Resolution for Opinion Summarization through Structured Rule Learning. Veselin Stoyanov and Claire Cardie. Proceedings of Empirical Methods in Natural Language Processing (EMNLP), 2006.
Toward Opinion Summarization: Linking the Sources. Veselin Stoyanov and Claire Cardie. COLING-ACL 2006 Workshop on Sentiment and Subjectivity in Text, 2006.
Using Natural Language Processing to Improve E-rulemaking. Claire Cardie, Cynthia Farina, Thomas Bruce, and Erica Wagner. Proceedings of the 7th Annual International Conference on Digital Government Research, 2006.
Better Inputs for Better Outcomes: Using the Interface to Improve e-Rulemaking. Cynthia Farina, Claire Cardie, Thomas Bruce, Erica Wagner. Workshop on eRulemaking at the Crossroads, Proceedings of the 7th Annual International Conference on Digital Government Research, 2006.
Annotating Expressions of Opinions and Emotions in Language. Janyce Wiebe, Theresa Wilson, Claire Cardie. Language Resources and Evaluation (formerly Computers and the Humanities), 39:2-3, 2005.
Identifying Sources of Opinions with Conditional Random Fields and Extraction Patterns. Yejin Choi, Claire Cardie, Ellen Riloff, and Siddharth Patwardhan. Proceedings of HLT-EMNLP 2005, 2005.
Multi-Perspective Question Answering Using the OpQA Corpus. Ves Stoyanov, Claire Cardie and Janyce Wiebe. Proceedings of HLT-EMNLP 2005, 2005.
Optimizing to Arbitrary NLP Metrics using Ensemble Selection. Art Munson, Claire Cardie, and Rich Caruana. Proceedings of HLT-EMNLP 2005, 2005.
OpinionFinder: A System for Subjectivity Analysis. Theresa Wilson, Paul Hoffmann, Swapna Somasundaran, Jason Kessler, Janyce Wiebe; Yejin Choi, Claire Cardie; Ellen Riloff and Siddharth Patwardhan. Proceedings of HLT/EMNLP 2005 Interactive Demonstrations, 2005. (demo)
Evaluating an Opinion Annotation Scheme Using a New Multi-Perspective Question and Answer Corpus. Claire Cardie, Janyce Wiebe, and Diane Litman. In Computing Attitude and Affect in Text: Theory and Practice. Shanahan, Qu, and Wiebe (eds.), Springer, 2005. Originally appeared in 2004 AAAI Spring Symposium on Exploring Attitude and Affect in Text, AAAI Press, 2004. Answer annotation instructions; Question creation instructions.
Playing the Telephone Game: Determining the Hierarchical Structure of Perspective and Speech Expressions. Eric Breck and Claire Cardie. 20th International Conference on Computational Linguistics (COLING-04), 2004.
Low-Level Annotations and Summary Representations of Opinions for Multiperspective QA. Claire Cardie, Janyce Wiebe, Theresa Wilson, & Diane Litman. In Mark Maybury (ed), New Directions in Question Answering, AAAI Press/MIT Press, 2004. (Originally appeared at the 2003 AAAI Spring Symposium on New Directions in Question Answering.)
Weakly Supervised Natural Language Learning Without Redundant Views. Vincent Ng and Claire Cardie. Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2003), 173-180, Association for Computational Linguistics, 2003.
Bootstrapping Coreference Classifiers with Multiple Machine Learning Algorithms. Vincent Ng and Claire Cardie. Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP-2003), Association for Computational Linguistics, 2003.
Recognizing and Organizing Opinions Expressed in the World Press. Janyce Wiebe, Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson, David Day, Mark Maybury. 2003 AAAI Spring Symposium on New Directions in Question Answering, 12-19, AAAI Press, 2003.
NRRC Summer Workshop on Multiple-Perspective Question Answering: Final Report. Janyce Wiebe, Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson. 2002.
Improving Machine Learning Approaches to Coreference Resolution. Vincent Ng and Claire Cardie. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2002.
Identifying Anaphoric and Non-Anaphoric Noun Phrases to Improve Coreference Resolution. Vincent Ng and Claire Cardie. Proceedings of the 19th International Conference on Computational Linguistics (COLING-2002), 2002.
Combining Sample Selection and Error-Driven Pruning for Machine Learning of Coreference Rules. Vincent Ng and Claire Cardie. Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2002.
Detecting Discrepancies in Numerical Estimates Using Multidocument Hypertext Summaries. Michael White, Claire Cardie, Vincent Ng, and Daryl McCullough. Proceedings of the Second International Conference on Human Language Technology Research (HLT-02), 2002.
Selecting Sentences for Multidocument Summaries Using Randomized Local Search. Michael White and Claire Cardie. ACL Workshop on Automatic Summarization, 2002.
Limitations of Co-Training for Natural Language Learning from Large Datasets. David Pierce and Claire Cardie. Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP-2001), Association for Computational Linguistics Research, 2001.
Constrained K-means Clustering with Background Knowledge. Kiri Wagstaff, Claire Cardie, Seth Rogers, and Stefan Schroedl. Proceedings of the Eighteenth International Conference on Machine Learning, Morgan Kaufmann, 2001.
Multi-document Summarization via Information Extraction. Michael White, Tanya Korelsky; Claire Cardie, Vincent Ng, David Pierce, and Kiri Wagstaff. Proceedings of the First International Conference on Human Language Technology Research (HLT-01), 2001.
Detecting Discrepancies and Improving Intelligibility: Two Preliminary Evaluations of RIPTIDES. Michael White, Claire Cardie, Vincent Ng, Kiri Wagstaff, and Daryl McCullough. 2001 Document Understanding Conference (DUC-01), 2001.
User-Oriented Machine Learning Strategies for Information Extraction: Putting the Human Back in the Loop. David Pierce and Claire Cardie. Working Notes of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining, pages 80-81, 2001.
Using Clustering and SuperConcepts within SMART: TREC 6. C. Buckley, M. Mitra, J. Walz, and C. Cardie. Information Processing and Management, 36(1), 109-131, 2000.
Examining the Role of Statistical and Linguistic Knowledge Sources in a General-Knowledge Question-Answering System. C. Cardie, V. Ng, D. Pierce, and C. Buckley. Proceedings of the Sixth Applied Natural Language Processing Conference (ANLP-2000), 180-187, Association for Computational Linguistics / Morgan Kaufmann, 2000.
Towards Translingual Information Access Using Portable Information Extraction. M. White, C. Cardie, C. Han, N. Kim, B. Lavoie, M. Palmer, O. Rambow, J. Yoon. Proceedings of the ANLP/NAACL Workshop on Embedded Machine Translation Systems, 31-37, 2000.
Integrating Case-Based Learning and Cognitive Biases for Machine Learning of Natural Language. C. Cardie. Journal of Experimental and Theoretical Artificial Intelligence, 11, 297-337, 1999.
The Role of Lexicalization and Pruning for Base Noun Phrase Grammars. C. Cardie and D. Pierce. Proceedings of the Sixteenth National Conference on Artificial Intelligence, 423-430, AAAI Press, 1999.
The Smart/Empire TIPSTER IR System. Chris Buckley, Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff, and Janet Walz. TIPSTER Phase III Proceedings, 107-121, Morgan Kaufmann, 1999.
SMART High Precision: TREC 7. Chris Buckley, Mandar Mitra, Janet Walz, and Claire Cardie. Proceedings of the Seventh Text REtrieval Conference (TREC-7), NIST Special Publication 500-242, 285-298, 1998.
Guest Editors' Introduction: Machine Learning and Natural Language. C. Cardie and R. Mooney. Machine Learning, 11:(1-3), 1-5, 1999.
Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification. C. Cardie and D. Pierce. ACL/Coling-98, 218-224. Association for Computational Linguistics, 1998.
Using Clustering and SuperConcepts within SMART: TREC 6. C. Buckley, M. Mitra, J. Walz, and C. Cardie. Proceedings of the Sixth Text REtrieval Conference (TREC-6), NIST Special Publication 500-240, 107-124, 1998.
Proposal for an Interactive Environment for Information Extraction. C. Cardie and D. Pierce. Cornell CS Technical Report TR98-1702, 1998.
Empirical Methods in Information Extraction. C. Cardie. AI Magazine, 18:4, 65-79, 1997. [Note that this is the version of the paper BEFORE it was formatted for AI Magazine by their editors.]
Improving Minority Class Prediction Using Case-Specific Feature Weights. C. Cardie and N. Howe. Proceedings of the Fourteenth International Conference on Machine Learning, D. Fisher, editor, Morgan Kaufmann, 57-65, 1997.
Examining Locally Varying Weights for Nearest Neighbor Algorithms. N. Howe and C. Cardie. Case-Based Reasoning Research and Development: Second International Conference on Case-Based Reasoning, D. Leake and E. Plaza, eds., Lecture Notes in Artificial Intelligence, Springer, 455-466, 1997.
An Analysis of Statistical and Syntactic Phrases. M. Mitra, C. Buckley, A. Singhal, and C. Cardie. 5TH RIAO Conference, Computer-Assisted Information Searching On the Internet, 200-214, 1997.
Proposal for a Framework for the High-Precision Identification of Linguistic Relationships. C. Cardie and S. Mardis. Cornell CS Technical Report TR97-1653, 1997.
Automating Feature Set Selection for Case-Based Learning of Linguistic Knowledge. C. Cardie. Proceedings of the Conference on Empirical Methods in Natural Language Processing, 113-126, University of Pennsylvania, 1996.
Embedded Machine Learning Systems for Natural Language Processing: A General Framework. C. Cardie. In S. Wermter, E. Riloff, and G. Scheler (eds.), Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing, Lecture Notes in Artificial Intelligence, 315-328, Springer, 1996. Originally presented at the Workshop on New Approaches to Learning for Natural Language Processing, 14th International Joint Conference on Artificial Intelligence (IJCAI-95), 119-126, AAAI Press, 1995.
Domain-Specific Knowledge Acquisition for Conceptual Sentence Analysis. C. Cardie. Ph.D. Thesis, University of Massachusetts, Amherst, MA, 1994. Available as University of Massachusetts, CMPSCI Technical Report 94-74. (178 pages, compressed postscript)
A Case-Based Approach to Knowledge Acquisition for Domain-Specific Sentence Analysis. C. Cardie. Proceedings of the Eleventh National Conference on Artificial Intelligence, 798-803, Washington, DC, 1993. AAAI Press / MIT Press.
Using Decision Trees to Improve Case-Based Learning. C. Cardie. Proceedings of the Tenth International Conference on Machine Learning, 25-32, Amherst, MA, 1993. Morgan Kaufmann.
Corpus-Based Acquisition of Relative Pronoun Disambiguation Heuristics. C. Cardie. Proceedings of the 30th Annual Conference of the Association for Computational Linguistics, 216-223, Newark, DE, 1992. Association for Computational Linguistics.
Learning to Disambiguate Relative Pronouns. C. Cardie. Proceedings of the Tenth National Conference on Artificial Intelligence, 38-43, San Jose, CA, 1992. AAAI Press / MIT Press.
Using Cognitive Biases to Guide Feature Set Selection. C. Cardie. Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, 743-748, Bloomington, IN, Lawrence Erlbaum Associates, and Working Notes of the AAAI Workshop on Constraining Learning with Prior Knowledge, 11-18, San Jose, CA, 1992.
A Cognitively Plausible Approach to Understanding Complicated Syntax. C. Cardie and W. Lehnert. Proceedings of the Ninth National Conference on Artificial Intelligence, 117-124, Anaheim, CA, 1991. AAAI Press / MIT Press.
Analyzing Research Papers Using Citation Sentences. W. Lehnert, C. Cardie, and E. Riloff. Proceedings of the Twelfth Annual Conference of the Cognitive Science Society, 511-518, Cambridge, MA, 1990. Lawrence Erlbaum Associates.