| SUNDAY, JUNE 3 |
| 8:45-9:00 | Welcome |
| Session I: Learning |
| 9:00-9:25 | Limitations of Co-Training for Natural Language Learning from Large Datasets | David Pierce and Claire Cardie |
| 9:25-9:50 | A Sequential Model for Multi-Class Classification | Yair Even-Zohar and Dan Roth |
| 9:50-10:15 | Learning Within-Sentence Semantic Coherence | Elena Eneva, Rose Hoberman and Lucian Lita |
| 10:15-10:45 | Break |
| Session II: Machine Translation |
| 10:45-11:10 | Knowledge Sources for Word-Level Translation Models | Philipp Köhn and Kevin Knight |
| 11:10-11:35 | Improving Lexical Mapping Model of English-Korean Bitext Using Structural Features | Seonho Kim, Juntae Yoon and Mansuk Song |
| 11:35-11:45 | Short break |
| 11:45-12:45 | Invited talk: Eric Brill, "Paucity Shmaucity -- What Can We Do With A Trillion Words?" |
| 12:45-2:00 | LUNCH |
| Session III: Text Categorization |
| 2:00-2:25 | Stacking Classifiers for Anti-Spam Filtering of E-Mail | Georgios Sakkis, Ion Androutsopoulos, Georgios Paliouras, Vangelis Karkaletsis, Constantine D. Spyropoulos and Panagiotis Stamatopoulos |
| 2:25-2:50 | Feature Space Restructuring for SVMs with Application to Text Categorization | Hiroya Takamura and Yuji Matsumoto |
| 2:50-3:15 | Using Bins to Empirically Estimate Term Weights for Text Categorization | Carl Sable and Kenneth W. Church |
| 3:15-3:25 | Short break |
| Session IV: Question Answering and Information Extraction |
| 3:25-3:50 | Question Answering Using a Large Text Database: A Machine Learning Approach | Hwee Tou Ng, Jennifer Lai Pheng Kwan and Yiyuan Xia |
| 3:50-4:15 | Information Extraction Using the Structured Language Model | Ciprian Chelba and Milind Mahajan |
| 4:15-4:30 | Short break |
| 4:30-5:30 | Panel: When does EM work? (includes an introduction to the Expectation-Maximization algorithm) | Eugene Charniak, Kevin Knight, Ted Pedersen, Stefan Riezler |
| MONDAY, JUNE 4 |
| Session V: Lexical Acquisition and Text Segmentation |
| 8:35-9:00 | Classifying the Semantic Relations in Noun Compounds via a Domain-Specific Lexical Hierarchy | Barbara Rosario and Marti Hearst |
| 9:00-9:25 | The Unknown Word Problem: a Morphological Analysis of Japanese Using Maximum Entropy Aided by a Dictionary | Kiyotaka Uchimoto, Satoshi Sekine and Hitoshi Isahara |
| 9:25-9:50 | Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem? | Patrick Schone and Daniel Jurafsky |
| 9:50-10:15 | Latent Semantic Analysis for Text Segmentation | Freddy Y. Y. Choi, Peter Wiemer-Hastings and Johanna Moore |
| 10:15-10:45 | Break |
| Session VI: Applications |
| 10:45-11:10 | Detecting Short Passages of Similar Text in Large Document Collections | Caroline Lyon, James Malcolm and Bob Dickerson |
| 11:10-11:35 | Hybrid Text Mining for Finding Abbreviations and their Definitions | Youngja Park and Roy J. Byrd |
| 11:35-11:45 | Short break |
| 11:45-12:45 | Panel: What Works and What Doesn't? Industrial Perspectives | Adam Berger, David Evans, Joshua Goodman, Lynette Hirschman |
| 12:45-2:00 | LUNCH |
| Session VII: Spoken Language Output |
| 2:00-2:25 | Automatic Corpus-based Tone Prediction using K-ToBI Representation | Jin-Seok Lee, Byeongchang Kim and Gary Geunbae Lee |
| 2:25-2:50 | Probabilistic Context-Free Grammars for Syllabification and Grapheme-to-Phoneme Conversion | Karin Müller |
| 2:50-3:00 | Short break |
| Session VIII: POS Tagging and Corpus Analysis |
| 3:00-3:25 | Comparing Data-Driven Learning Algorithms for PoS Tagging of Swedish | Beáta Megyesi |
| 3:25-3:50 | Impact of Quality and Quantity of Corpora on Stochastic Generation | Srinivas Bangalore, John Chen and Owen Rambow |
| 3:50-4:15 | Corpus Variation and Parser Performance | Daniel Gildea |
| 4:15-4:30 | Refreshments (close) |
Note: the formatting of this page comes from the version generated for
the NAACL CDROM.