DAWN dataset
This is a temporal higher-order network dataset, which here means a sequence of timestamped simplices, where each simplex is a set of nodes. The Drug Abuse Warning Network (DAWN) is a national health surveillance system that records drug use contributing to hospital emergency department visits throughout the United States. Each simplex in this dataset is the set of drugs used by a patient (as reported by the patient) in a single emergency department visit, so the nodes are drugs. The drugs include illicit substances, prescription and over-the-counter medication, and dietary supplements. Timestamps of visits are recorded at the resolution of quarter-years and span a total of 8 years. Each timestamp is encoded as the integer year * 4 + quarter, where year ranges from 2004 to 2011 and quarter ranges from 1 to 4.
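The timestamp encoding can be inverted to recover the year and quarter. The following is a minimal Python sketch of that arithmetic. The function names encode_timestamp and decode_timestamp are hypothetical helpers introduced here for illustration and are not part of the dataset distribution.

```python
from typing import Tuple


def encode_timestamp(year: int, quarter: int) -> int:
    """Encode (year, quarter) as year * 4 + quarter, with quarter in 1..4."""
    if not 1 <= quarter <= 4:
        raise ValueError("quarter must be between 1 and 4")
    return year * 4 + quarter


def decode_timestamp(t: int) -> Tuple[int, int]:
    """Invert the encoding and return (year, quarter).

    Because quarter runs from 1 to 4 rather than 0 to 3, subtract 1
    before dividing. A plain divmod(t, 4) would map quarter 4 of a
    year to quarter 0 of the following year.
    """
    year, zero_based_quarter = divmod(t - 1, 4)
    return year, zero_based_quarter + 1
```

Under this encoding, the first quarter of 2004 is 2004 * 4 + 1 = 8017 and the last quarter of 2011 is 2011 * 4 + 4 = 8048, which gives 32 distinct timestamp values over the 8 years.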
  • number of nodes: 2,558
  • number of timestamped simplices: 2,272,433
  • number of unique simplices: 143,523
  • number of edges in projected graph: 122,963
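To make the four statistics above concrete, here is a small Python sketch that computes the same quantities on toy data. It assumes that a unique simplex is a distinct node set regardless of timestamp, and that the projected graph has an edge between two nodes whenever they appear together in at least one simplex. The function summarize and the in-memory input format, a list of (timestamp, node set) pairs, are hypothetical; the sketch does not read the dataset's actual files.

```python
from itertools import combinations


def summarize(timestamped_simplices):
    """Compute summary counts for an iterable of (timestamp, nodes) pairs."""
    nodes = set()
    unique_simplices = set()
    projected_edges = set()
    total = 0
    for _timestamp, simplex in timestamped_simplices:
        total += 1
        members = sorted(set(simplex))
        nodes.update(members)
        # The same node set at different timestamps counts once.
        unique_simplices.add(frozenset(members))
        # Every pair of nodes within a simplex is a projected-graph edge.
        projected_edges.update(combinations(members, 2))
    return {
        "nodes": len(nodes),
        "timestamped_simplices": total,
        "unique_simplices": len(unique_simplices),
        "projected_edges": len(projected_edges),
    }


# Toy example with made-up node ids, not real DAWN records.
toy = [
    (8017, {1, 2, 3}),
    (8017, {1, 2}),
    (8018, {1, 2, 3}),
    (8020, {4}),
]
stats = summarize(toy)
```

In the toy example, the simplex {1, 2, 3} occurs twice but is counted as one unique simplex, and the single-node simplex {4} contributes a node but no edge.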
Data: If you use this data, please cite the following paper:
  • Simplicial closure and higher-order link prediction.
    Austin R. Benson, Rediet Abebe, Michael T. Schaub, Ali Jadbabaie, and Jon Kleinberg.
    Proceedings of the National Academy of Sciences (PNAS), 2018.