Manhattan-taxi-trajectories dataset

This dataset consists of 1,000 processed taxi trajectories over a one-year period. Each trajectory is the sequence of Manhattan neighborhoods visited by a single taxi, identified by its medallion number. Transitions correspond to pickups and dropoffs and are derived from a sequence of (pickup neighborhood, dropoff neighborhood, timestamp) records. If the next pickup is in the same neighborhood as the last dropoff, we do not count it as a transition. An additional "outside Manhattan" state captures all locations outside the borough. The dataset was derived from data collected by Chris Whong via a Freedom of Information Law request.
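The derivation described above can be sketched in Python as follows. This is a minimal illustration, not the code used to produce the dataset: the function name `build_trajectory`, the `manhattan` set of neighborhood names, and the neighborhood names in the usage note are all hypothetical. It also assumes that timestamps are sortable values and that a trip's dropoff is always recorded, even when it matches that trip's pickup, which the description does not specify.

```python
from typing import Hashable, Iterable, List, Set, Tuple

# Label for the extra state described in the text.
OUTSIDE = "outside Manhattan"

# One record: (pickup neighborhood, dropoff neighborhood, timestamp).
Trip = Tuple[str, str, Hashable]


def build_trajectory(trips: Iterable[Trip], manhattan: Set[str]) -> List[str]:
    """Turn one taxi's trip records into a sequence of visited states.

    `manhattan` is a hypothetical set of neighborhood names; any location
    not in it is mapped to the "outside Manhattan" state.
    """

    def state(neighborhood: str) -> str:
        return neighborhood if neighborhood in manhattan else OUTSIDE

    trajectory: List[str] = []
    # Process trips in chronological order (assumes sortable timestamps).
    for pickup, dropoff, _timestamp in sorted(trips, key=lambda t: t[2]):
        p, d = state(pickup), state(dropoff)
        # A pickup in the same area as the last dropoff is not a transition.
        if not trajectory or trajectory[-1] != p:
            trajectory.append(p)
        # Assumption: the dropoff is always recorded.
        trajectory.append(d)
    return trajectory
```

For example, with `manhattan = {"Harlem", "Chelsea", "SoHo"}`, the trips `("Harlem", "Chelsea", 1)`, `("Chelsea", "SoHo", 2)`, and `("SoHo", "JFK", 3)` give the trajectory Harlem, Chelsea, SoHo, outside Manhattan: both repeated pickups are skipped and the unknown location is mapped to the outside state.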

Data files:

If you use this data, please cite the following:
  • The spacey random walk: a stochastic process for higher-order data.
    Austin R. Benson, David F. Gleich, and Lek-Heng Lim.
    SIAM Review 59:2, 321–345, 2017.
  • FOILing NYC’s Taxi Trip Data.
    Chris Whong.
    Accessed via