trivago-clicks dataset
This is a hypergraph in which nodes are accommodations (mostly hotels) and each hyperedge is the set of accommodations for which a user performed the "click-out" action during a single browsing session; a click-out means the user was forwarded to a partner site. A few hyperedges are repeated. The dataset was derived from the ACM RecSys Challenge 2019. Each node is labeled with the country where the accommodation is located. Some summary statistics of the dataset are:
  • number of nodes: 172,738
  • number of hyperedges: 233,202
  • mean / median hyperedge size: 4.1 / 3.0
  • rank of hypergraph (maximum hyperedge size): 86
  • number of node classes: 160
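
As an illustration of how statistics like these can be computed, here is a minimal Python sketch. It assumes the raw input is a list of (session id, accommodation id) click-out pairs and that node labels are available as a dictionary. The function names, the event layout, and the toy country codes are all hypothetical and do not describe the format of the distributed files. It also assumes that repeated hyperedges are counted separately and that every session with at least one click-out yields a hyperedge; the original preprocessing may differ.

```python
from collections import defaultdict
from statistics import mean, median


def hyperedges_from_clickouts(events):
    """Group (session_id, accommodation_id) click-out pairs into one set per session."""
    by_session = defaultdict(set)
    for session_id, accommodation_id in events:
        # Clicking out on the same accommodation twice in a session adds nothing new.
        by_session[session_id].add(accommodation_id)
    # One hyperedge per session; identical sets from different sessions
    # are kept as repeated hyperedges (an assumption of this sketch).
    return [frozenset(items) for items in by_session.values()]


def summarize(hyperedges, node_labels):
    """Compute the summary statistics listed above for a labeled hypergraph."""
    sizes = [len(edge) for edge in hyperedges]
    # Nodes are everything that is labeled or appears in some hyperedge.
    nodes = set(node_labels).union(*hyperedges)
    return {
        "num_nodes": len(nodes),
        "num_hyperedges": len(hyperedges),
        "mean_size": mean(sizes),
        "median_size": median(sizes),
        "rank": max(sizes),  # maximum hyperedge size
        "num_classes": len(set(node_labels.values())),
    }


# Toy data (hypothetical): four sessions, seven accommodations, three countries.
events = [
    ("s1", 1), ("s1", 2), ("s1", 3), ("s1", 2),
    ("s2", 2), ("s2", 4),
    ("s3", 1), ("s3", 2), ("s3", 3),
    ("s4", 4), ("s4", 5), ("s4", 6), ("s4", 7),
]
labels = {1: "DE", 2: "DE", 3: "FR", 4: "FR", 5: "ES", 6: "ES", 7: "ES"}

edges = hyperedges_from_clickouts(events)
stats = summarize(edges, labels)
print(stats)
```

In the toy data, sessions s1 and s3 produce the same set of accommodations, which is how repeated hyperedges arise under this construction.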
Data: If you use this data, please cite the following paper:
  • Generative hypergraph clustering: from blockmodels to modularity.
    Philip S. Chodrow, Nate Veldt, and Austin R. Benson.
    Science Advances, 2021.