amazon-reviews dataset
The amazon-reviews network is a hypergraph whose hyperedges are sets of products reviewed on Amazon, as collected by Jianmo Ni, Jiacheng Li, and Julian McAuley (specifically, we use the collection of 5-core datasets). Nodes are products, each labeled by its product category. Some summary statistics of the dataset are:
  • number of nodes: 2,268,231
  • number of hyperedges: 4,285,363
  • mean / median hyperedge size: 17.1 / 8
  • rank of hypergraph (maximum hyperedge size): 9,350
  • number of node classes: 29
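As a rough illustration of how statistics like those above can be computed, here is a minimal Python sketch. It assumes the hypergraph is already in memory as a list of hyperedges (each a collection of node ids) plus a dict mapping node ids to class labels. The function name `hypergraph_summary`, the input layout, and the toy data are all hypothetical; they are not taken from the dataset files.

```python
from statistics import mean, median


def hypergraph_summary(hyperedges, node_labels):
    """Compute basic summary statistics for a labeled hypergraph.

    hyperedges:  list of iterables of node ids, one per hyperedge (assumed layout)
    node_labels: dict mapping node id -> class label (assumed layout)
    """
    # Hyperedge size = number of distinct nodes in the hyperedge.
    sizes = [len(set(edge)) for edge in hyperedges]

    # Count every node that has a label or appears in some hyperedge.
    nodes = set(node_labels)
    for edge in hyperedges:
        nodes.update(edge)

    return {
        "num_nodes": len(nodes),
        "num_hyperedges": len(hyperedges),
        "mean_size": mean(sizes),
        "median_size": median(sizes),
        "rank": max(sizes),  # rank = maximum hyperedge size
        "num_classes": len(set(node_labels.values())),
    }


# Toy data for illustration only; not drawn from amazon-reviews.
toy_edges = [[1, 2, 3], [2, 4], [1, 2, 3, 4, 5]]
toy_labels = {1: "Books", 2: "Books", 3: "Electronics", 4: "Toys", 5: "Toys"}
summary = hypergraph_summary(toy_edges, toy_labels)
```

On the real dataset the same function would apply once the hyperedges and labels have been parsed into these structures; the parsing step is omitted here because it depends on the file format.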
Data files:
If you use this data, please cite the following paper:
  • Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects.
    Jianmo Ni, Jiacheng Li, Julian McAuley.
    EMNLP, 2019.