Lillian Lee
Assistant Professor
llee@cs.cornell.edu
http://www.cs.cornell.edu/home/llee/
PhD Harvard, 1997
My primary research focus is on statistical methods for natural language processing, with particular interest in
problems arising from sparse data.
Recent work has investigated the power of similarity-based techniques to improve probability estimation, the fundamental technology underlying any statistical approach. I am currently interested in analyzing similarity functions, both theoretically and empirically; preliminary results include the development of a novel family of information-theoretic functions and a new analysis framework for similarity functions in general. In addition, continuing our work on both nearest-neighbor techniques and clustering methods, Fernando Pereira and I have been moving towards an understanding of the relationships between these two complementary paradigms.
In other work, Rie Ando and I have
been developing empirical methods for segmenting Japanese, which lacks
space delimiters between words. Our
algorithms rely on neither a dictionary
nor pre-segmented training data, but
only on simple statistics drawn from unannotated text. Preliminary
results are very promising: we are
achieving error rates far below those of
morphological analyzers over a variety
of performance metrics.
Honors
- College of Engineering Teaching Award, 1998-1999
University Activities
- Chair: Computer Science colloquium series.
- Member: Field of Cognitive Studies.
Professional Activities
- Reviewer: Computer Speech and Language
- Reviewer: Natural Language Engineering
- NSF review panel
- Program Committees: 37th Annual Meeting of the Association for Computational Linguistics (ACL 99) (reviewer); Fourth Conference on Empirical Methods in Natural Language Processing/Very Large Corpora (EMNLP/VLC '99); ACL-99 Workshop on Unsupervised Learning in Natural Language Processing; Student Abstract and Poster Program, Sixteenth National Conference on Artificial Intelligence (reviewer)
Lectures
- Statistical methods in natural language processing (four-hour tutorial). Fifteenth National Conference on Artificial Intelligence, Madison, Wisconsin, 1998 (with J. Lafferty).
- Unsupervised segmentation of Japanese. Invited talk. ACL Workshop on Unsupervised Learning in Natural Language Processing, Univ. of Maryland, 1999.
Publications
- Similarity-based models of word co-occurrence probabilities. Machine Learning 34 (1999), 43-69. Special Issue on Natural Language Learning (with I. Dagan and F. Pereira).
- Measures of distributional similarity. 37th Annual Meeting of the Association for Computational Linguistics (1999), 25-32.
- Distributional similarity models: Clustering vs. nearest neighbors. 37th Annual Meeting of the Association for Computational Linguistics (1999), 33-40 (with F. Pereira).