Xanda Schofield's Page

Publications

As of July 2019, this list is no longer being updated.

Aaron Schein, Zhiwei Steven Wu, Alexandra Schofield, Mingyuan Zhou, and Hanna Wallach. Locally private Bayesian inference for count models. To appear in ICML 2019.
Alexandra Schofield, Aaron Schein, Zhiwei Steven Wu, and Hanna Wallach. A variational inference approach for locally private inference of Poisson factorization models. NeurIPS Workshop on Privacy Preserving Machine Learning (PPML), 2018.
Alexandra Schofield and Thomas Davidson. Identifying hate speech in social media. ACM Crossroads Magazine (XRDS) Vol 24 (2), 2017.
Alexandra Schofield, Laure Thompson, and David Mimno. Quantifying the effects of text duplication on semantic models. EMNLP, 2017.*
Alexandra Schofield, Måns Magnusson, Laure Thompson, and David Mimno. Understanding text pre-processing for latent Dirichlet allocation. ACL Workshop for Women in NLP (WiNLP), 2017.
Alexandra Schofield, Måns Magnusson, and David Mimno. Pulling out the stops: Rethinking stopword removal for topic models. EACL, 2017.
Alexandra Schofield and David Mimno. Comparing apples to apple: the effects of stemmers on topic models. TACL Vol. 4, 2016.**
Alexandra Schofield and Leo Mehr. Gender-distinguishing features in film dialogue. NAACL 2016 Workshop on Computational Linguistics for Literature (CLFL), 2016.***
Jack Hessel, Alexandra Schofield, Lillian Lee, David Mimno. What do vegans do in their spare time? Latent interest detection in multi-community networks. NeurIPS Networks Workshop, 2015.
Robert Keller, Alexandra Schofield, August Toman-Yih, Zachary Merritt, John Elliott. Automating the explanation of jazz chord progressions using idiomatic analysis. Computer Music Journal 37:4, 54-69, 2013.
Robert M Keller, August Toman-Yih, Alexandra Schofield, Zachary Merritt. A creative improvisational companion based on idiomatic harmonic bricks. Proc. 3rd ICCC, 2012.

Errata

* Uncited but relevant to this paper is Raphael Cohen et al.'s 2014 PLoS One paper, Redundancy-Aware Topic Modeling for Patient Record Notes, which develops a topic model, Red-LDA, that aims to combat the effects of text duplication.

** This work focuses solely on English, a fact which the title does not specify. While the analysis methodologies should generalize to other languages, the results discouraging stemming may not. See Chandler May et al.'s 2016 arXiv paper, Analysis of Morphology in Topic Modeling, for an example of a somewhat different result in Russian.

*** This paper uses a version of gender analysis (assuming binary genders and classifying gender using common baby name lists) that I would not recommend because it is inaccurate and not gender-inclusive. For a better approach, check out Brian Larson's 2017 EthNLP paper, Gender as a Variable in Natural-Language Processing: Ethical Considerations.
