
Course Material
 01/23: Introduction (slides)
 Overview of course topics
 Course administration and grading
 Warmup assignment
 01/28: Generative vs. Discriminative Supervised Learning (slides)
 Template for project idea pitch and project guidelines.
 Generative models, conditional probabilistic models, decision models.
 Maximum-likelihood estimation and empirical risk minimization.
 Naive Bayes, logistic regression, and support vector machines.
 01/30: Generative Hidden Markov Models (slides)
 Representation and assumptions of HMMs
 Maximum-likelihood estimation of HMMs
 Most probable configurations and Viterbi algorithm
 Reading: Koller, Friedman, Getoor, Taskar, “Graphical Models in a Nutshell”. (paper)
 02/04: Project Pitches
 See Piazza for the slides.
 02/06: I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun, Support Vector Machine Learning for Interdependent and Structured Output Spaces, ICML, 2004. (paper) (slides)
 02/11: Ben Taskar, Carlos Guestrin and Daphne Koller. Max-Margin Markov Networks. NIPS, 2004. (paper) (slides)
 02/11: D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data. CVPR, 2005. (paper) (slides)
 02/20: Matthew Blaschko, Christoph Lampert. Learning to Localize Objects with Structured Output Regression. ECCV, 2008. (paper) (slides)
 02/20: John Lafferty, Andrew McCallum, Fernando Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML, 2001. (paper) (slides)
 02/25: Nathan Ratliff, Andrew Bagnell, Martin Zinkevich. Maximum Margin Planning. ICML, 2006. (paper) (slides)
 02/25: J. Weston, O. Chapelle, A. Elisseeff, B. Schoelkopf and V. Vapnik, Kernel Dependency Estimation, NIPS, 2002. (paper) (slides)
 02/27: T. Joachims. A Support Vector Method for Multivariate Performance Measures. ICML, 2005. (paper) (slides)
 02/27: Yisong Yue, T. Joachims. Predicting Diverse Subsets Using Structural SVMs. ICML, 2008. (paper) (slides)
 03/04: T. Joachims, L. Granka, Bing Pan, H. Hembrooke, F. Radlinski, G. Gay. Evaluating the Accuracy of Implicit Feedback from Clicks and Query Reformulations in Web Search, ACM Transactions on Information Systems (TOIS), Vol. 25, No. 2 (April), 2007. (paper) (slides)
 03/06: B. Carterette, P. Bennett, D. Chickering, S. Dumais. Here or There: Preference Judgments for Relevance. ECIR, 2008. (paper) (slides)
 03/06: O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation, ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012. (paper) (slides)
 03/11: Yisong Yue, J. Broder, R. Kleinberg, T. Joachims. The K-armed Dueling Bandits Problem. JCSS, 2012. (paper) (slides)
 03/13: P. Shivaswamy, T. Joachims. Online Structured Prediction via Coactive Learning, ICML, 2012. (paper) (slides)
 03/18: E. Agichtein, E. Brill, S. T. Dumais and R. Ragno. Learning user interaction models for predicting web search preferences. SIGIR, 2006. (paper) (slides)
 03/18: O. Chapelle and Y. Zhang. A dynamic Bayesian network click model for web search ranking. WWW Conference, 2009. (paper) (slides)
 03/20: Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Peter Welinder, Pietro Perona, Serge Belongie. Visual Recognition with Humans in the Loop. ECCV, 2010. (paper) (slides)
 03/20: Seyda Ertekin, Haym Hirsh, Cynthia Rudin. Selective Sampling of Labelers for Approximating the Crowd. AAAI Fall Symposium, 2012. (paper) (slides)
 03/25: Chris Piech, Jonathan Huang, Zhenghao Chen, Chuong Do, Andrew Ng, Daphne Koller. Tuned Models of Peer Assessment in MOOCs. EDM, 2013. (paper) (slides)
 03/27: Shuo Chen, Joshua Moore, Douglas Turnbull, Thorsten Joachims, Playlist Prediction via Metric Embedding, ACM Conference on Knowledge Discovery and Data Mining (KDD), 2012. (paper) (slides)
 03/27: J. Moore, Shuo Chen, T. Joachims, D. Turnbull, Taste over Time: the Temporal Dynamics of User Preferences, Conference of the International Society for Music Information Retrieval (ISMIR), 2013. (paper) (slides)
 04/08: Yoshua Bengio, Rejean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, Vol 3, 2003. (paper) (slides) (slides)
 04/08: Jason Weston, Samy Bengio, Nicolas Usunier. WSABIE: Scaling Up To Large Vocabulary Image Annotation. IJCAI, 2011. (paper) (slides)
 04/10: Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS, 2013. (paper) (slides)
 04/10: Richard Socher, Brody Huval, Christopher Manning, Andrew Y. Ng. Semantic compositionality through recursive matrix-vector spaces. EMNLP, 2012. (paper) (slides)
 04/15: D. Blei, A. Ng, M. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR), 3(5):993–1022, 2003. (paper) (slides)
 04/15: Prem Gopalan, Jake Hofman, David Blei. Scalable Recommendation with Poisson Factorization. Online report, 2013. (paper) (slides)
 04/17: Steffen Rendle, Lars Schmidt-Thieme. Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation. WSDM, 2010. (paper) (slides)


Reference Material
Structured Output Prediction
 I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun, Support Vector Machine Learning for Interdependent and Structured Output Spaces, ICML, 2004. (paper)
 Ben Taskar, Carlos Guestrin and Daphne Koller. Max-Margin Markov Networks. NIPS, 2004. (paper)
 D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data. CVPR, 2005. (paper)
 Andrew McCallum, Dayne Freitag, and Fernando Pereira. Maximum entropy Markov models for information extraction and segmentation. ICML, 2000. (paper)
 John Lafferty, Andrew McCallum, Fernando Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML, 2001. (paper)
 Chun-Nam John Yu, T. Joachims, R. Elber, J. Pillardy. Support Vector Training of Protein Alignment Models. Journal of Computational Biology, 15(7): 867-880, September 2008. (paper)
 Brooke Cowan, Ivona Kucerova, and Michael Collins, A Discriminative Model for Tree-to-Tree Translation, EMNLP 2006. (paper)
 Yisong Yue, T. Finley, F. Radlinski, T. Joachims. A Support Vector Method for Optimizing Average Precision. SIGIR, 2007. (paper)
 Yisong Yue, T. Joachims. Predicting Diverse Subsets Using Structural SVMs. ICML, 2008. (paper)
 Matthew Blaschko, Christoph Lampert. Learning to Localize Objects with Structured Output Regression. ECCV, 2008. (paper)
 Rajhans Samdani, Dan Roth. Efficient Decomposed Learning for Structured Prediction. ICML, 2012. (paper)
 A. Fix, T. Joachims, S. Park, R. Zabih. Structured learning of sum-of-submodular higher order energy functions. ICCV, 2013. (paper)
 Nathan Ratliff, Andrew Bagnell, Martin Zinkevich. Maximum Margin Planning. ICML, 2006. (paper)
 Ulf Brefeld, Tobias Scheffer, Semi-Supervised Learning for Structured Output Variables, ICML, 2006. (paper)
 J. Weston, O. Chapelle, A. Elisseeff, B. Schoelkopf and V. Vapnik, Kernel Dependency Estimation, NIPS, 2002. (paper)
 Hal Daume, John Langford, Daniel Marcu, Search-based Structured Prediction, Machine Learning, 2009. (paper)
 Matthew Richardson, Pedro Domingos, Markov Logic Networks, Machine Learning, Vol. 62, Number 1-2, pp. 107-136, 2006. (paper)
 Kuzman Ganchev, Joao Graca, Jennifer Gillenwater, Ben Taskar. Posterior Regularization for Structured Latent Variable Models. JMLR, 10, 2010. (paper)
 Chun-Nam Yu, Thorsten Joachims. Learning Structural SVMs with Latent Variables. ICML 2009. (paper)
Machine Learning with Humans in the Loop
 T. Joachims, L. Granka, Bing Pan, H. Hembrooke, F. Radlinski, G. Gay. Evaluating the Accuracy of Implicit Feedback from Clicks and Query Reformulations in Web Search, ACM Transactions on Information Systems (TOIS), Vol. 25, No. 2 (April), 2007. (paper)
 Ben Carterette, Rosie Jones. Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks. NIPS, 2007. (paper)
 F. Radlinski, M. Kurup, T. Joachims. How Does Clickthrough Data Reflect Retrieval Quality? CIKM, 2008. (paper)
 O. Chapelle, T. Joachims, F. Radlinski, Yisong Yue, Large-Scale Validation and Analysis of Interleaved Search Evaluation, ACM Transactions on Information Systems (TOIS), 30(1):6.1-6.41, 2012. (paper)
 O. Chapelle and Y. Zhang. A dynamic Bayesian network click model for web search ranking. WWW Conference, 2009. (paper)
 E. Agichtein, E. Brill, S. T. Dumais and R. Ragno. Learning user interaction models for predicting web search preferences. SIGIR, 2006. (paper)
 D. Beeferman, A. Berger. Agglomerative clustering of search engine query logs. KDD, 2000. (paper)
 Alex Strehl, John Langford, Sham Kakade, Lihong Li. Learning from Logged Implicit Exploration Data. NIPS, 2010. (paper)
 Yisong Yue, J. Broder, R. Kleinberg, T. Joachims. The K-armed Dueling Bandits Problem. JCSS, 2012. (paper)
 Abner Guzman-Rivera, Dhruv Batra, Pushmeet Kohli. Multiple Choice Learning: Learning to Produce Multiple Structured Outputs, NIPS, 2012. (paper)
 B. Carterette, P. Bennett, D. Chickering, S. Dumais. Here or There: Preference Judgments for Relevance. ECIR, 2008. (paper)
 P. Shivaswamy, T. Joachims. Online Structured Prediction via Coactive Learning, ICML, 2012. (paper)
 A. Jain, B. Wojcik, T. Joachims, A. Saxena. Learning Trajectory Preferences for Manipulators via Iterative Improvement. NIPS, 2013. (paper)
 Seyda Ertekin, Haym Hirsh, Cynthia Rudin. Selective Sampling of Labelers for Approximating the Crowd. AAAI Fall Symposium, 2012. (paper)
 Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Peter Welinder, Pietro Perona, Serge Belongie. Visual Recognition with Humans in the Loop. ECCV, 2010. (paper)
 Chris Piech, Jonathan Huang, Zhenghao Chen, Chuong Do, Andrew Ng, Daphne Koller. Tuned Models of Peer Assessment in MOOCs. EDM, 2013. (paper)
 Ruben Sipos, Arpita Ghosh, Thorsten Joachims. Was This Review Helpful to You? It Depends! Context and Voting Patterns in Online Content. WWW, 2014. (paper)
Learning Representations
 Shuo Chen, Joshua Moore, Douglas Turnbull, Thorsten Joachims, Playlist Prediction via Metric Embedding, ACM Conference on Knowledge Discovery and Data Mining (KDD), 2012. (paper)
 J. Moore, Shuo Chen, T. Joachims, D. Turnbull, Taste over Time: the Temporal Dynamics of User Preferences, Conference of the International Society for Music Information Retrieval (ISMIR), 2013. (paper)
 Geoffrey Hinton, Sam Roweis. Stochastic Neighbor Embedding. NIPS, 2002. (paper)
 David Gleich, Matthew Rasmussen, Kevin Lang, and Leonid Zhukov. The world of music: User ratings; spectral and spherical embeddings; map projections. Online report, 2006. (paper)
 John Platt. Fast Embedding of Sparse Music Similarity Graphs. NIPS, 2004. (paper)
 Steffen Rendle, Lars Schmidt-Thieme. Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation. WSDM, 2010. (paper)
 Prem Gopalan, Jake Hofman, David Blei. Scalable Recommendation with Poisson Factorization. Online report, 2013. (paper)
 Thomas Hofmann. Probabilistic Latent Semantic Indexing. SIGIR, 1999. (paper)
 D. Blei, A. Ng, M. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR), 3(5):993–1022, 2003. (paper)
 Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. NIPS, 2013. (paper)
 Ding Zhou, Shenghuo Zhu, Kai Yu, Xiaodan Song, Belle Tseng, Hongyuan Zha, Lee Giles. Learning Multiple Graphs for Document Recommendations. WWW, 2008. (paper)
 Amir Globerson, Gal Chechik, Fernando Pereira, Naftali Tishby. Euclidean Embedding of Co-occurrence Data. JMLR, Vol 8, 2007. (paper)
 Yoshua Bengio, Rejean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. JMLR, Vol 3, 2003. (paper)
 Andriy Mnih, Geoffrey Hinton. Three new graphical models for statistical language modelling. ICML, 2007. (paper)
 Richard Socher, Brody Huval, Christopher Manning, Andrew Y. Ng. Semantic compositionality through recursive matrix-vector spaces. EMNLP, 2012. (paper)
 Eric Huang, Richard Socher, Christopher Manning, Andrew Y. Ng. Improving word representations via global context and multiple word prototypes. ACL, 2012. (paper)
 Jason Weston, Samy Bengio, Nicolas Usunier. WSABIE: Scaling Up To Large Vocabulary Image Annotation. IJCAI, 2011. (paper)
 Andriy Mnih, Yee Whye Teh. A fast and simple algorithm for training neural probabilistic language models. ICML, 2012. (paper)
