Research

I am broadly interested in Natural Language Processing and Machine Learning methods that can be used to automatically learn the underlying knowledge in languages.

On the methodology side, I am intrigued by elegant models trained end-to-end, such as Deep Neural Networks, which can learn powerful representations without explicitly injecting human knowledge. I am also interested in structured prediction models that can handle complex output spaces, since NLP problems usually have structured outputs.

On the application side, I am very interested in extracting and understanding the knowledge behind languages, through tasks such as representation learning, information extraction, and reading comprehension.

Recently, my thesis research has focused on learning deep representations for low-resource and zero-resource cross-lingual model transfer.

Publications

PhD Dissertation

Learning Deep Representations for Low-Resource Cross-Lingual Natural Language Processing
Xilun Chen
May, 2019
pdf

Journal Articles

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Xilun Chen, Yu Sun, Ben Athiwaratkun, Claire Cardie and Kilian Weinberger
Transactions of the Association for Computational Linguistics (TACL).
Article, bibtex (TACL), arXiv, bibtex (arXiv), talk@EMNLP2018, code

Conference Proceedings

Multi-Source Cross-Lingual Model Transfer: Learning What to Share
Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang and Claire Cardie
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)
proceedings, bibtex, arXiv, code

Unsupervised Multilingual Word Embeddings
Xilun Chen, Claire Cardie
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)
proceedings, bibtex, arXiv, poster, code

Multinomial Adversarial Networks for Multi-Domain Text Classification
Xilun Chen, Claire Cardie
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018)
proceedings, bibtex, arXiv, poster, code

A Rectangle Mining Method for Understanding the Semantics of Financial Tables
Xilun Chen, Laura Chiticariu, Marina Danilevsky, Alexandre Evfimievski and Prithviraj Sen
The 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017)
proceedings, pdf, poster, bibtex, dataset

Combining Global Models for Parsing Universal Dependencies
Tianze Shi, Felix G. Wu, Xilun Chen and Yao Cheng
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (CoNLL 2017)
pdf, bibtex

Price of Anarchy of Innovation Diffusion in Social Networks
Xilun Chen and Chenxia Wu
WINE 2014 (Poster)
pdf

Multi-Domain Adaptation for SMT Using Multi-Task Learning
Lei Cui, Xilun Chen, Dongdong Zhang, Shujie Liu, Mu Li and Ming Zhou
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)
pdf, bibtex

Experiences

05.2018 - 08.2018, Microsoft Research, Redmond, WA.
Research Intern with the Language and Information Technology team
Proposed a zero-resource multilingual model transfer method that requires neither target language training data nor cross-lingual resources. (Paper)

05.2017 - 08.2017, Facebook, Menlo Park, CA.
PhD Intern working with the Search NLP team
Developed an unsupervised aspect extractor for user reviews. E.g. Input: I like the food but the service is terrible. Extracted aspects: food (positive), service (negative).

05.2016 - 08.2016, IBM Research, San Jose, CA.
Research Intern with the Scalable NLP team
Devised a rectangle mining method for understanding the structure and semantics of tabular data in financial documents. (Paper published at ICDAR 2017.)

05.2015 - 08.2015, Google, Mountain View, CA.
PhD Intern with the Display Ads Predictive Targeting team
Implemented a new targeting model for recommending high-performing Ad keywords based on the slope-one collaborative filtering algorithm.

05.2012 - 02.2013, Microsoft Research Asia, Beijing, China.
Undergrad Research Intern with the Natural Language Computing team
Built a streamlined tool to automate and distribute the training of a machine translation pipeline. Participated in the first showcase of real-time speech-to-speech machine translation.

Teaching

TA for CS4700: Principles of Artificial Intelligence
TA for CS4300: Language and Information
TA for CS4740/5740: Introduction to Natural Language Processing
TA for CS2110: Object-Oriented Programming and Data Structures

Professional Services

EMNLP 2019 (PC member: Long Papers)
NAACL-HLT 2019 (PC member: Long and Short Papers)
RepL4NLP @ ACL 2019 (PC member)
EMNLP 2018 (PC member: Long Papers)
RepL4NLP @ ACL 2018 (PC member)
W-NUT @ EMNLP 2018 (PC member)

Miscellaneous

I cook on a daily basis when I’m stuck in Ithaca, where good Chinese food is scarce.

I do archery (mostly indoor Olympic-style recurve archery) in my spare time, and sometimes go skiing when weather permits.