We study and model the process by which humans summarize creative documents (e.g., from a movie script to a synopsis). We develop a customized topic model based on Poisson Factorization and inspired by the creativity literature, which links the text in a summary to the text in the original document. Traditional Poisson Factorization approximates documents as positive combinations of topics, i.e., as points in the cone defined by a set of topics (in the Euclidean space defined by the words in the vocabulary). The model proposed here captures not only this “inside the cone” portion of a document, but also the “outside the cone” portion that is not explained by a combination of common topics. The model captures how these two types of content are weighted in summaries as compared to full documents. In addition, it captures writing norms that influence the extent to which each topic appears in summaries compared to full documents. We apply this model to a dataset of academic marketing papers and their abstracts, and to a dataset of movie scripts and their synopses. We illustrate a practical application of our research by creating a public, online interactive tool meant to serve as a “sounding board” for users interested in writing summaries of creative documents.
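To make the "inside the cone" and "outside the cone" distinction concrete, the following is a minimal sketch and not the model described in the abstract. It assumes a toy vocabulary of three words and two fixed topics, and it swaps in a least-squares projection (computed with multiplicative updates) in place of the Poisson likelihood. The function name `project_onto_cone` and all numbers are hypothetical, chosen only for illustration.

```python
def project_onto_cone(topics, doc, iters=200, eps=1e-12):
    """Split a word-count vector into a part explained by a non-negative
    combination of topics ("inside the cone") and a residual ("outside").

    Assumption: squared-error fit via multiplicative updates, used here as a
    simple stand-in for Poisson Factorization. All inputs must be non-negative.
    """
    n_topics, n_words = len(topics), len(doc)
    weights = [1.0] * n_topics  # positive start keeps every update non-negative
    for _ in range(iters):
        # Current reconstruction of the document from the topics.
        recon = [sum(weights[k] * topics[k][w] for k in range(n_topics))
                 for w in range(n_words)]
        for k in range(n_topics):
            num = sum(topics[k][w] * doc[w] for w in range(n_words))
            den = sum(topics[k][w] * recon[w] for w in range(n_words)) + eps
            weights[k] *= num / den
    inside = [sum(weights[k] * topics[k][w] for k in range(n_topics))
              for w in range(n_words)]
    outside = [doc[w] - inside[w] for w in range(n_words)]
    return weights, inside, outside


# Hypothetical example: topic 0 uses only word 0, topic 1 uses only word 1.
topics = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
doc = [2.0, 3.0, 1.0]  # word 2 is not covered by any topic
weights, inside, outside = project_onto_cone(topics, doc)
print(weights, inside, outside)
```

In this toy case the counts for the first two words fall inside the cone, while the count for the third word is left in the residual, which plays the role of the content that common topics do not explain. With a least-squares fit the residual can be negative in general, which is one reason this sketch should not be read as the proposed model.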

Olivier Toubia is the Glaubinger Professor of Business at Columbia Business School. His research focuses on various aspects of innovation (including idea generation, preference measurement, and the diffusion of innovation), social networks, and behavioral economics. He teaches a course on Customer-Centric Innovation and the core marketing course in the MBA and Executive MBA programs. He received his MS in Operations Research and PhD in Marketing from MIT.