| SUNDAY, JUNE 3 | 
| 8:45 | 
9:00 | 
Welcome | 
|
 | Session I: Learning | 
| 9:00 | 
9:25 | 
Limitations of Co-Training for Natural Language Learning from Large Datasets | 
| David Pierce and Claire Cardie | 
|
| 9:25 | 
9:50 | 
A Sequential Model for Multi-Class Classification | 
| Yair Even-Zohar and Dan Roth | 
|
| 9:50 | 
10:15 | 
Learning Within-Sentence Semantic Coherence | 
| Elena Eneva, Rose Hoberman and Lucian Lita | 
| 10:15 | 
10:45 | 
Break | 
 | Session II: Machine Translation | 
| 10:45 | 
11:10 | 
Knowledge Sources for Word-Level Translation Models | 
| Philipp Köhn and Kevin Knight | 
| 11:10 | 
11:35 | 
Improving Lexical Mapping Model of English-Korean Bitext Using Structural Features | 
| Seonho Kim, Juntae Yoon and Mansuk Song | 
| 11:35 | 
11:45 | 
Short break | 
| 11:45 | 
12:45 | 
Invited talk: Eric Brill, "Paucity Shmaucity -- What Can We Do With A Trillion Words?" | 
| 12:45 | 
2:00 | 
LUNCH | 
|
 | Session III: Text Categorization | 
| 2:00 | 
2:25 | 
Stacking Classifiers for Anti-Spam Filtering of E-Mail | 
| Georgios Sakkis, Ion Androutsopoulos, Georgios Paliouras, Vangelis Karkaletsis, Constantine D. Spyropoulos and Panagiotis Stamatopoulos | 
|
| 2:25 | 
2:50 | 
Feature Space Restructuring for SVMs with Application to Text Categorization | 
| Hiroya Takamura and Yuji Matsumoto | 
|
| 2:50 | 
3:15 | 
Using Bins to Empirically Estimate Term Weights for Text Categorization | 
| Carl Sable and Kenneth W. Church | 
| 3:15 | 
3:25 | 
Short break | 
|
 | Session IV: Question Answering and Information Extraction | 
| 3:25 | 
3:50 | 
Question Answering Using a Large Text Database: A Machine Learning Approach | 
| Hwee Tou Ng, Jennifer Lai Pheng Kwan and Yiyuan Xia | 
|
| 3:50 | 
4:15 | 
Information Extraction Using the Structured Language Model | 
| Ciprian Chelba and Milind Mahajan | 
| 4:15 | 
4:30 | 
Short break | 
| 4:30 | 
5:30 | 
Panel: When does EM work? (includes an introduction to the Expectation-Maximization algorithm). Eugene Charniak, Kevin Knight, Ted Pedersen, Stefan Riezler | 
 | MONDAY, JUNE 4 | 
|
 | Session V: Lexical Acquisition and Text Segmentation | 
| 8:35 | 
9:00 | 
Classifying the Semantic Relations in Noun Compounds via a Domain-Specific Lexical Hierarchy | 
| Barbara Rosario and Marti Hearst | 
|
| 9:00 | 
9:25 | 
The Unknown Word Problem: a Morphological Analysis of Japanese Using Maximum Entropy Aided by a Dictionary | 
| Kiyotaka Uchimoto, Satoshi Sekine and Hitoshi Isahara | 
|
| 9:25 | 
9:50 | 
Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem? | 
| Patrick Schone and Daniel Jurafsky | 
|
| 9:50 | 
10:15 | 
Latent Semantic Analysis for Text Segmentation | 
| Freddy Y. Y. Choi, Peter Wiemer-Hastings and Johanna Moore | 
| 10:15 | 
10:45 | 
Break | 
|
 | Session VI: Applications | 
| 10:45 | 
11:10 | 
Detecting Short Passages of Similar Text in Large Document Collections | 
| Caroline Lyon, James Malcolm and Bob Dickerson | 
|
| 11:10 | 
11:35 | 
Hybrid Text Mining for Finding Abbreviations and their Definitions | 
| Youngja Park and Roy J. Byrd | 
| 11:35 | 
11:45 | 
Short break | 
| 11:45 | 
12:45 | 
Panel: What Works and What Doesn't? Industrial Perspectives. Adam Berger, David Evans, Joshua Goodman, Lynette Hirschman | 
| 12:45 | 
2:00 | 
LUNCH | 
|
 | Session VII: Spoken Language Output | 
| 2:00 | 
2:25 | 
Automatic Corpus-based Tone Prediction using K-ToBI Representation | 
| Jin-Seok Lee, Byeongchang Kim and Gary Geunbae Lee | 
|
| 2:25 | 
2:50 | 
Probabilistic Context-Free Grammars for Syllabification and Grapheme-to-Phoneme Conversion | 
| Karin Müller | 
| 2:50 | 
3:00 | 
Short break | 
|
 | Session VIII: POS Tagging and Corpus Analysis | 
| 3:00 | 
3:25 | 
Comparing Data-Driven Learning Algorithms for PoS Tagging of Swedish | 
| Beáta Megyesi | 
|
| 3:25 | 
3:50 | 
Impact of Quality and Quantity of Corpora on Stochastic Generation | 
| Srinivas Bangalore, John Chen and Owen Rambow | 
|
| 3:50 | 
4:15 | 
Corpus Variation and Parser Performance | 
| Daniel Gildea | 
| 4:15 | 
4:30 | 
Refreshments (close) | 
Note: the formatting of this page comes from the version generated for
the NAACL CDROM.