Xilun Chen

I am a final-year CS PhD student at Cornell University working on Natural Language Processing. My advisor is Prof. Claire Cardie. Before that, I did my undergraduate study at Shanghai Jiao Tong University.

Cornell University
349 Gates Hall
Ithaca, NY 14853

Email: xlchen@nospam.cs.cornell.edu


I am broadly interested in Natural Language Processing and Machine Learning methods that can be used to automatically learn the underlying knowledge in languages.

On the methodology side, I am intrigued by elegant models trained end to end, in particular the ability of Deep Neural Networks to learn powerful representations without explicitly injected human knowledge. I am also interested in structured prediction models that can handle complex output spaces, since NLP problems usually have structured outputs.

On the application side, I am very interested in extracting and understanding the knowledge behind languages, through tasks such as representation learning, information extraction, and reading comprehension.

My proposed thesis research focuses on learning deep representations for low-resource / zero-resource cross-lingual model transfer.


Manuscripts & Preprints

Zero-Resource Multilingual Model Transfer: Learning What to Share
Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang and Claire Cardie
(In submission)

Journal Articles

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie and Kilian Weinberger
(To Appear) Transactions of the Association for Computational Linguistics (TACL).
arXiv, bibtex, code

Conference Proceedings

Unsupervised Multilingual Word Embeddings
Xilun Chen, Claire Cardie
(To Appear) 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)
arXiv, code

Multinomial Adversarial Networks for Multi-Domain Text Classification
Xilun Chen, Claire Cardie
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018)
proceedings, bibtex, arXiv, code

A Rectangle Mining Method for Understanding the Semantics of Financial Tables
Xilun Chen, Laura Chiticariu, Marina Danilevsky, Alexandre Evfimievski and Prithviraj Sen
The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017)
proceedings, pdf, poster, bibtex, dataset

Combining Global Models for Parsing Universal Dependencies
Tianze Shi, Felix G. Wu, Xilun Chen and Yao Cheng
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (CoNLL 2017)
pdf, bibtex

Price of Anarchy of Innovation Diffusion in Social Networks
Xilun Chen and Chenxia Wu
WINE 2014 (Poster)

Multi-Domain Adaptation for SMT Using Multi-Task Learning
Lei Cui, Xilun Chen, Dongdong Zhang, Shujie Liu, Mu Li and Ming Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)
pdf, bibtex


05.2018 - 08.2018, Microsoft Research, Redmond, WA.
Research Intern with the Language and Information Technology team
Worked on zero-resource cross-lingual model transfer (pdf).

05.2017 - 08.2017, Facebook, Menlo Park, CA.
PhD Intern working with the Search NLP team
Worked on aspect extraction from review data. For example, given the review "I like the food but the service is terrible.", the system should extract "food" as a positive aspect and "service" as a negative one.

05.2016 - 08.2016, IBM Research, San Jose, CA.
Research Intern with the Scalable NLP team
Worked on understanding the structure and semantics of tabular data in financial documents. Our paper was published at ICDAR 2017.

05.2015 - 08.2015, Google, Mountain View, CA.
PhD Intern with the Display Ads Predictive Targeting team
Implemented a new targeting model based on Slope One collaborative filtering to recommend keywords to Ads for better coverage using only keyword performance statistics.

05.2012 - 02.2013, Microsoft Research Asia, Beijing, China.
Full-time Intern working on Machine Translation with Dr. Dongdong Zhang and Dr. Ming Zhou
Designed an all-in-one GUI Auto Trainer for the Machine Translation pipeline of MSRA. Participated in a real-time speech-to-speech machine translation task (News, Video).


TA for CS4700: Principles of Artificial Intelligence
TA for CS4300: Language and Information
TA for CS4740/5740: Introduction to Natural Language Processing
TA for CS2110: Object-Oriented Programming and Data Structures

Professional Services

EMNLP 2018 (PC member: Long Papers)
RepL4NLP @ ACL 2018 (PC member)
W-NUT @ EMNLP 2018 (PC member)


I cook on a daily basis when I’m stuck in Ithaca, where good Chinese food is scarce.

I do archery (mostly indoor Olympic-style recurve archery) in my spare time, and sometimes go skiing when weather permits.