Department of Computer Science
E-Mail: [first name]@cs.cornell.edu
413 Gates Hall,
Ithaca, NY 14853-5169
I am a fourth year graduate student in the Department of Computer Science at Cornell University. I am easily distracted by (cover-letter-speak: my interests are) new or elegant ideas in Machine Learning, more broadly, Artificial Intelligence, with applications that grapple with ambiguity (e.g. Natural Language Processing) and/or adapt to an ever-changing worldview.
Following the parable of the Rabbit's Ph.D., I'm advised by Prof. Thorsten Joachims. He is flanked by Prof. Johannes Gehrke, Prof. Éva Tardos and Prof. Robert Kleinberg in my committee. My graduate minor is in Applied Mathematics.
I interned with the Computer Human Interactive Learning group at Microsoft Research Redmond
during the summer of 2013, Search Labs at Microsoft Research, Silicon Valley during the summer of 2012,
and worked as a strategist with Tower Research Capital
for 14 months from June 2010 - July 2011. I received a Bachelor of Technology degree in
Computer Science and Engineering from
Indian Institute of Technology, Bombay in 2010
and interned over the summer of 2009 at Microsoft India Development Centre
as a Software Development Engineer-Tester.
Before that, I played games and sports all day every day.
POEM: Policy Optimization for Exponential Models. [Coming soon:] POEM performs counterfactual risk minimization on logged bandit feedback data.
KESS: Keyphrase Extraction through Submodular Summarization. [Coming soon:] KESS extracts phrases that describe novel and influential ideas from a time-stamped corpus of texts.
[DEMO] Top-K Relation Extraction. [Coming soon:] Using off-the-shelf part-of-speech taggers (Stanford POS), parsers (Enju) and relation extractors (RECK), this on-line demo shows how to go beyond myopic inference for pipeline tasks.
[DEMO] Augmenting textbooks with videos. [Coming soon:] On-line demo for trying different approaches based on Latent Semantic Analysis to find appropriate videos for textbook sections (tutorial on LSA).
Temporal corpus summarization using submodular word coverage. CIKM 2012.
Ruben Sipos, Adith Swaminathan, Pannaga Shivaswamy and Thorsten Joachims.
Evolving corpora can be summarized through timelines indicating the influential elements. We created interesting timelines of articles in the ACL anthology and NIPS conference proceedings over a span of 15 years using landmark documents, influential authors and topic key-phrases. We achieved this by formulating a sub-modular objective extending the coverage problem over words (traditional summarization) to incorporate coverage across time.
Beyond myopic inference in Big Data pipelines. KDD 2013.
Karthik Raman, Adith Swaminathan, Johannes Gehrke and Thorsten Joachims.
Pipelines constructed using modular components have very brittle performance: errors in earlier components cascade through the pipeline causing catastrophic failures in the eventual output. We explored simple ideas from Probabilistic Graphical Models to make the inference procedure more robust, while still using off-the-shelf components to compose pipelines.
Mining videos from the web for electronic textbooks. ICFCA 2014.
Rakesh Agrawal, Maria Christoforaki, Sreenivas Gollapudi, Anitha Kannan, Krishnaram Kenthapadi and Adith Swaminathan.
There is a wealth of educational content available online, but the primary teaching aid for the vast majority of students (e.g. India and China) continues to be the textbook. Recognizing how textbook sections are organized and mining them for concepts using Formal Concept Analysis, we automatically retrieved appropriate videos from Youtube Edu. Using transcripts made available by educators to remain accessible to deaf students, we side-stepped the challenge of automatic audio transcription or content-based video retrieval.
This timeline collects some of the fun research projects I worked on, arranged chronologically. The icons represents Talks, references Unpublished research projects, and references Publications. The colors indicate:
Hex is a two-player board game
invented by Piet Hein and John Nash. The first player to move has an advantage (a winning strategy exists, but not known for 9x9 boards and larger).
Hex has a large branching factor in the size of the board, which make search-based AI for Hex challenging.
Advised by Prof. Amitabha Sanyal, Nikhil Hooda and I coded a functional AI for Hex in Scheme using micro- and macro-reductions in an abstract connection game, Y. Using differential calculus to infer approximate Probability of Ownership for unoccupied hexes, NASH consistently beat Hexy (2000 Olympiad Gold, AAAI 2000 Outstanding Paper) and Queenbee (2000 Olympiad Silver) [both use alpha-beta search based on virtual connections] on 7x7 boards playing as Player 1, and was competitive with both programs on 9x9 boards. Note, Olympiads are typically played on 11x11 boards and use the pie rule to mitigate Player 1 advantage, which wasn't simulated. The current state-of-the-art is MoHex, which uses Monte Carlo Tree Search. Read more about AI research for Hex here.
Advised by Prof. G. Sivakumar, Karthik Raman, Nikhil Hooda and I tackled the classic online K-server problem. We derived the competitive ratio for several online algorithms, and constructed request sequences that showed these ratio bounds were tight. We also created a variant of the Harmonic algorithm that was O(2k log k)-competitive and efficient. The state-of-the-art is the Work Function Algorithm which is O(2k-1) competitive (but very inefficient). We couldn't construct a request sequence to show this bound is tight. The K-server conjecture (there exists a deterministic O(k)-competitive algorithm) is open.
Samhita Kasula, Omkar Wagh, Karthik Raman and I were intrigued that we readily anthropomorphize pets but "intelligent entities" (say, Google) are still firmly considered tools. We studied affective computing, and presented work on simulating human-like emotions (e.g. incorporating prosody in Natural Language Processing -- a separate project that Nikhil Hooda, Prachur Goel and I worked on) to have more emotionally intelligent human-computer interactions. Read more about affective computing here.
Surreal numbers are a simple, constructive number system that generalize the reals, infinitesimals and infinite numbers, invented by John Conway. Following Conway's insight connecting surreal numbers to two player games, I demonstrated how surreals can be used to reason about Hex - however, this representation proved more inefficient than NASH's approximations. This talk won Gold at the Exposition General Championship at IIT Bombay. Read more about surreals and games here.
Advised by Prof. Soumen Chakrabarti, I built a reviewer-paper matching tool that integrated seamlessly with Microsoft CMT, ConfMaster, etc. The tool worked by scraping DBLP for recent publications of reviewers to build a text-profile, collecting citations from submissions, reviewer bids and topic/track/category matches from CMT etc. Optimal weighting of these different affinity cues was learned using transductive regression and using Max-Margin Markov Networks. The assignment was achieved using a min-cost max-flow problem. Modeling asymmetric costs (dissatisfied reviewers who don't get the papers they want vs. irritated reviewers who get papers they don't want) is important.