Course readings are provided below. We do not expect you to read entries marked as “inspiration”; they are provided in case you have spare time and want to dig deeper into the topics. Kosuke Imai’s book is forthcoming from Princeton University Press. Professor Imai has kindly given us permission to use the textbook free of charge in advance of its official release. A PDF of the book is available through Absalon. Please do not circulate it.

August 8

Keywords: Introduction to SDS, Introduction to R.



If you’re interested and want to delve deeper into coding and programming (you certainly don’t have to; it is not required for this course), I highly recommend the following posts.

August 9 & 10

Keywords: Visualization, Data Manipulation, Data Import, Functions.


  • Grolemund, Garrett and Hadley Wickham. 2016. “R for Data Science”. Read chapters 3, 4 and 9. Browse chapter 15.

  • Imai, Kosuke. 2016. A First Course in Quantitative Social Science. Read chapter 4, sections 1-3. (See note 1.)

Browse the following


Below are links to some interesting videos describing how companies such as the New York Times or FiveThirtyEight think about visualizing data as well as some posts and videos on the underlying theory behind the “tidyverse” and an introduction to working with spatial data in R.

August 11

Keywords: Web Scraping, API.



Below are some interesting academic papers using data scraped from online sources that might provide inspiration for your exam project.

August 12

Keywords: Big Data, Reproducible Research.


  • John Gerring. 2012. Measurements. Chapter 7 in Social Science Methodology, 2nd ed., Cambridge University Press. (Low-quality copies will be provided.)

  • Christine L. Borgman. 2015. “Provocations”, “What Are Data?” and “Data Scholarship in the Social Sciences”. Chapters 1, 2 and 6 in Big Data, Little Data, No Data. MIT Press. (Copies will be provided.)

  • Einav and Levin: Economics in the Age of Big Data. Science. 2013.

  • Edelman, Benjamin. 2012. “Using internet data for economic research.” The Journal of Economic Perspectives, 26.2: 189-206.

  • Anderson, Chris. 2008. “The end of theory: The data deluge makes the scientific method obsolete.” Wired, 16-07.

Read one of the following


August 15

Keywords: Observational data, Causation.

Keywords: Prediction, Statistical Learning.


August 16

Keywords: Unsupervised Learning.

  • Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2001. “Introduction to statistical learning”. Vol. 1. Springer, Berlin: Springer series in statistics. (pages: 373-399).

August 17

Keywords: Privacy.



  • Alessandro Acquisti. 2015. The Economics and Behavioral Economics of Privacy. Chapter 3 in Privacy, Big Data, and the Public Good: Frameworks for Engagement (eds. Julia Lane, Victoria Stodden, Stefan Bender, Helen Nissenbaum). Cambridge University Press.

  • Fabian Neuhaus & Timothy Webmoor. 2012. “Agile Ethics for Massified Research and Visualization.” Information, Communication & Society 15:1, 43-65.

August 18

Keywords: Text Data.


  1. You will notice that Kosuke Imai uses the base R package much more frequently than we do. We will instead write R code that follows the “tidyverse” approach to data analysis. We do not expect you to repeat every line of code in the chapter, but you should have a rough idea of what each line of code does.