DAWN dataset
This is a temporal higher-order network dataset, which here means a
sequence of timestamped simplices where each simplex is a set of nodes.
The Drug Abuse Warning Network (DAWN) is a national health surveillance
system that records drug use contributing to hospital emergency
department visits throughout the United States. Each simplex in this
dataset is the set of drugs used by a patient (as reported by the
patient) in one emergency department visit, so each node is a drug.
The drugs include illicit
substances, prescription and over-the-counter medication, and dietary
supplements. Visit timestamps are recorded at quarter-year resolution
and span 8 years. Each timestamp is encoded as the integer
(year * 4 + quarter), where year ranges from 2004 to 2011 and quarter
ranges from 1 to 4.
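The timestamp encoding can be inverted with integer arithmetic. The following is a minimal sketch, not part of the dataset's own tooling; the function names `encode_timestamp` and `decode_timestamp` are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def encode_timestamp(year: int, quarter: int) -> int:
    """Encode (year, quarter) as year * 4 + quarter, with quarter in 1..4."""
    if not 1 <= quarter <= 4:
        raise ValueError("quarter must be in 1..4")
    return year * 4 + quarter


def decode_timestamp(t: int) -> Tuple[int, int]:
    """Invert encode_timestamp.

    Subtracting 1 before dividing keeps quarter 4 in the correct year,
    because quarters are numbered 1..4 rather than 0..3.
    """
    year, zero_based_quarter = divmod(t - 1, 4)
    return year, zero_based_quarter + 1


first = encode_timestamp(2004, 1)  # 8017, the earliest possible value
last = encode_timestamp(2011, 4)   # 8048, the latest possible value
```

Under this encoding the timestamps take 32 consecutive integer values (8017 through 8048), one per quarter, so differences between timestamps count elapsed quarters directly.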
Some basic statistics of this dataset:
- number of nodes: 2,558
- number of timestamped simplices: 2,272,433
- number of unique simplices: 143,523
- number of edges in projected graph: 122,963
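Summary counts of this kind can be computed from a list of timestamped simplices. The sketch below uses made-up toy data, not DAWN records, and rests on two assumptions that the description does not spell out: that "unique simplices" means distinct node sets regardless of timestamp, and that the projected graph joins every pair of nodes that co-occur in at least one simplex, with the edge weight counting the simplices that contain both nodes. The function name `summarize` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (timestamp, set of node ids) pairs.
timestamped_simplices = [
    (8017, {1, 2, 3}),
    (8017, {1, 2}),
    (8018, {1, 2, 3}),
    (8019, {4}),
]


def summarize(simplices):
    """Return basic counts for a sequence of (timestamp, node set) pairs."""
    nodes = set()
    unique_simplices = set()
    edge_weights = Counter()
    for _timestamp, simplex in simplices:
        nodes |= simplex
        # Assumption: uniqueness ignores the timestamp.
        unique_simplices.add(frozenset(simplex))
        # Assumption: every co-occurring pair becomes a projected edge,
        # weighted by the number of simplices containing both nodes.
        for u, v in combinations(sorted(simplex), 2):
            edge_weights[(u, v)] += 1
    return {
        "nodes": len(nodes),
        "timestamped_simplices": len(simplices),
        "unique_simplices": len(unique_simplices),
        "projected_edges": len(edge_weights),
        "edge_weights": dict(edge_weights),
    }


stats = summarize(timestamped_simplices)
```

For the toy data this gives 4 nodes, 4 timestamped simplices, 3 unique simplices, and 3 projected edges. A simplex with a single node, such as `{4}`, contributes a node but no edge.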
Data files:
- DAWN.tar.gz (timestamped simplices and node labels)
- DAWN-proj-graph.tar.gz (weighted projected graph)
Reference:
- Simplicial closure and higher-order link prediction.
  Austin R. Benson, Rediet Abebe, Michael T. Schaub, Ali Jadbabaie, and Jon Kleinberg.
  Proceedings of the National Academy of Sciences (PNAS), 2018.