CS 1380 + ORIE 1380 + STSCI 1380
Data Science For All
Spring 2021

Catalog Description: This course provides an introduction to data science. Given data from economics, medicine, biology, or physics, collected from internet denizens, survey respondents, or wireless sensors, how can one understand the phenomenon generating the data, make predictions, and improve decisions? We focus on building skills in inferential thinking and computational thinking, guided by the practical questions we seek to answer. The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. We will also consider social issues in data analysis such as privacy and design.

Lecture: MWF 10:10-11:00am, Kennedy Hall 116 (Call Auditorium), starting February 8, 2021.

Instructor: David P. Williamson

Topics and Course Objectives

A schedule of lectures and assignments is available.

This course is positioned at the intersection of computing and statistics, with an emphasis on empirical analysis of real-world datasets through computation, rather than mathematical theory. There are no prerequisites other than high-school algebra. As the Cornell motto says, we welcome “Any person…any study.”

The course is organized into three units:

Outcomes:

Answers to Some Important Questions

Q: Is it OK if I’m undeclared? Or if I’m majoring in something other than CS, ORIE, or Stats?
A: Yes! All majors are welcome, especially those from outside CS, ORIE, and Stats!

Q: Do I need to know a lot of math for 1380?
A: Basic high school mathematics (e.g., Algebra I and II) is all you need. We won’t use any calculus in 1380.

Q: Do I need to know how to program for 1380?
A: Nope! We’ll teach you everything you need to know.

Q: Can I take 1380 if I’ve already taken a class on introductory programming (e.g., CS 1110) or stats (e.g., AEM 2100, ENGRD 2700, HADM 2010, ILRST 2100, MATH 1710, PAM 2100, STSCI 2100)?
A: Yes, but if you’ve taken both programming and stats before, you’ll likely find that 1380 moves too slowly for you. You could instead consider INFO 2950, INFO 3300, CS 4780, ORIE 4740, ORIE 4741, or STSCI 4060.

Acknowledgment

This course is based on Data 8, a course taught by Ani Adhikari and John DeNero at the University of California, Berkeley. They and their teaching assistants have developed many of the materials we are using in our own course. We are using those materials with their permission, which we gratefully acknowledge.