We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.



Erez Lieberman Aiden is a fellow at the Harvard Society of Fellows and Visiting Faculty at Google. His work integrates mathematical and physical theory with the invention of new technologies. He recently invented a method for three-dimensional genome sequencing; he subsequently led the team that, in 2009, reported the first three-dimensional map of the human genome. Together with collaborator Jean-Baptiste Michel, he developed culturomics, a quantitative approach to the study of history and culture that relies on computational analysis of a significant fraction of the historical record. This work led to the creation of the Google Ngram Viewer, a supplemental website that was visited over a million times in the 24 hours after its launch. Erez's research has won numerous awards, including the 2010 Hertz Thesis Prize; recognition for one of the top 20 "Biotech Breakthroughs that will Change Medicine" by Popular Mechanics; the Lemelson-MIT prize for the best student inventor at MIT; the American Physical Society's Award for the Best Doctoral Dissertation in Biological Physics; and membership in Technology Review's 2009 TR35. His last three papers have all appeared on the cover of Nature and Science. His work has also been featured on the front page of the New York Times, the Boston Globe, and the Wall Street Journal.


Culturomics: Quantitative Analysis of Culture Using Millions of Digitized Books

Erez Lieberman-Aiden

Harvard Society of Fellows