Deep Recursive Neural Networks
Paper : Deep Recursive Neural Networks for Compositionality in Language
O. Irsoy, C. Cardie. NIPS 2014, Montreal, Quebec.

TL;DR : We stack multiple recursive layers to construct a deep recursive net which outperforms traditional shallow recursive nets on sentiment detection.

This figure is supposed to summarize the whole idea.

Abstract : Recursive neural networks comprise a class of architecture that can operate on structured input. They have been previously successfully applied to model compositionality in natural language using parse-tree-based structural representations. Even though these architectures are deep in structure, they lack the capacity for hierarchical representation that exists in conventional deep feed-forward networks as well as in recently investigated deep recurrent neural networks. In this work we introduce a new architecture — a deep recursive neural network (deep RNN) — constructed by stacking multiple recursive layers. We evaluate the proposed model on the task of fine-grained sentiment classification. Our results show that deep RNNs outperform associated shallow counterparts that employ the same number of parameters. Furthermore, our approach outperforms previous baselines on the sentiment analysis task, including a multiplicative RNN variant as well as the recently introduced paragraph vectors, achieving new state-of-the-art results. We provide exploratory analyses of the effect of multiple layers and show that they capture different aspects of compositionality in language.
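
To make the stacking idea concrete, here is a minimal Python sketch; it is not the released C++ code. It assumes one plausible connectivity: within a layer, each internal node combines the hidden states of its left and right children, and in every layer above the first, each node also receives the hidden state of the same node from the layer below. In the first layer only the leaves get an input from below, namely their word vectors. The names (`make_layers`, `forward_layer`, `deep_forward`), the tuple-based tree encoding, the ReLU activation, and the random initialization are illustrative assumptions, and the output (classification) layer is omitted.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def relu_sum(*vectors):
    """Add the vectors elementwise, then apply a rectifier (assumed activation)."""
    return [max(0.0, sum(xs)) for xs in zip(*vectors)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def make_layers(input_dim, hidden_dim, depth, seed=0):
    """One parameter set per recursive layer (hypothetical layout)."""
    rng = random.Random(seed)
    layers = []
    for i in range(depth):
        below_dim = input_dim if i == 0 else hidden_dim
        layers.append({
            "W_left": rand_matrix(hidden_dim, hidden_dim, rng),   # left child -> parent
            "W_right": rand_matrix(hidden_dim, hidden_dim, rng),  # right child -> parent
            "V": rand_matrix(hidden_dim, below_dim, rng),         # layer below -> this layer
            "b": [0.0] * hidden_dim,
        })
    return layers

# A node is a tuple (below, left, right):
#   below is the vector coming from the layer underneath (or None),
#   left and right are child nodes (both None for a leaf).
def leaf(word_vector):
    return (word_vector, None, None)

def node(left, right):
    return (None, left, right)

def forward_layer(n, p):
    """Run one recursive layer bottom-up; returns a tree of the same shape
    whose first slot now holds this layer's hidden state."""
    below, left, right = n
    terms = [p["b"]]
    if below is not None:
        terms.append(matvec(p["V"], below))
    left_out = right_out = None
    if left is not None:
        left_out = forward_layer(left, p)
        right_out = forward_layer(right, p)
        terms.append(matvec(p["W_left"], left_out[0]))
        terms.append(matvec(p["W_right"], right_out[0]))
    return (relu_sum(*terms), left_out, right_out)

def deep_forward(tree, layers):
    """Stack the layers: the output tree of one layer is the input of the next."""
    for p in layers:
        tree = forward_layer(tree, p)
    return tree

# Usage: the parse ((w1 w2) w3) with made-up 4-dimensional word vectors.
tree = node(node(leaf([0.1] * 4), leaf([0.2] * 4)), leaf([0.3] * 4))
root_hidden = deep_forward(tree, make_layers(4, 3, depth=3))[0]
print(len(root_hidden))  # size of the root representation in the top layer
```

Because every layer returns a tree of the same shape as its input, stacking is just a loop over layers, and a shallow recursive net is the special case `depth=1`.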

Poster : Presentation is available here.

Slides : Part of this talk, which I gave at the Cornell AI seminar, was based on this work.

Code : My C++ code is here. Please cite the paper if you use it.

Data : We use the Stanford Sentiment Treebank for experimental evaluation. You should cite the appropriate paper by Socher et al. if you use the dataset. I found this particular format easier to work with, and my code expects it.

BibTeX :

  title = {Deep Recursive Neural Networks for Compositionality in Language},
  author = {\.Irsoy, Ozan and Cardie, Claire},
  booktitle = {Advances in Neural Information Processing Systems 27},
  editor = {Z. Ghahramani and M. Welling and C. Cortes and N.D. Lawrence and K.Q. Weinberger},
  pages = {2096--2104},
  year = {2014},
  publisher = {Curran Associates, Inc.},
  url = {},
  location = {Montreal, Quebec}