sos-email-Eu-core dataset
This dataset is a collection of sequences of sets. Each sequence is derived from the recipient sets of emails sent by a single email address at a European research institution. Timestamps were recorded at a resolution of 1 second, and each set consists of all recipients of the emails sent by a given sender at a given timestamp. All sequences contain at least 10 sets, and only sets of size at most 5 are included. Some basic statistics of this dataset are:
  • number of sequences: 681
  • number of unique elements appearing in sets: 937
  • number of sets: 202,769
  • number of unique sets: 9,694
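The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the dataset: the input format (an iterable of `(sender, recipient, timestamp)` tuples), the function names `build_sequences` and `summarize`, and the order of the two filters (oversized sets are dropped first, then short sequences) are all assumptions.

```python
from collections import defaultdict


def build_sequences(records, max_set_size=5, min_seq_len=10):
    """Group email records into per-sender sequences of recipient sets.

    `records` is assumed to be an iterable of (sender, recipient, timestamp)
    tuples; this input format is hypothetical.
    """
    # One set per (sender, timestamp): all recipients at that second.
    grouped = defaultdict(set)
    for sender, recipient, timestamp in records:
        grouped[(sender, timestamp)].add(recipient)

    # Build each sender's sequence in timestamp order,
    # skipping sets larger than max_set_size (assumed filter order).
    sequences = defaultdict(list)
    for sender, timestamp in sorted(grouped, key=lambda key: key[1]):
        recipients = grouped[(sender, timestamp)]
        if len(recipients) <= max_set_size:
            sequences[sender].append((timestamp, frozenset(recipients)))

    # Keep only senders with enough remaining sets.
    return {s: seq for s, seq in sequences.items() if len(seq) >= min_seq_len}


def summarize(sequences):
    """Compute the four counts listed in the statistics above."""
    all_sets = [s for seq in sequences.values() for _, s in seq]
    return {
        "sequences": len(sequences),
        "unique_elements": len(set().union(*all_sets)),
        "sets": len(all_sets),
        "unique_sets": len(set(all_sets)),
    }
```

Calling `summarize(build_sequences(records))` on a list of such tuples returns a dictionary with the same four kinds of counts as the list above. Using `frozenset` makes each recipient set hashable, so repeated sets can be counted as one unique set.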
Data: If you use this data, please cite the following papers:
  • Sequences of sets.
    Austin R. Benson, Ravi Kumar, and Andrew Tomkins.
    Proceedings of KDD, 2018.
  • Local Higher-order Graph Clustering.
    Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich.
    Proceedings of KDD, 2017.
  • Graph Evolution: Densification and Shrinking Diameters.
    Jure Leskovec, Jon Kleinberg, and Christos Faloutsos.
    ACM Transactions on Knowledge Discovery from Data, 2007.