This repository holds the materials for a workshop on working with Twitter data in R. I first ran it at Campus Luzern in February 2021. I have since updated the material, and the most recent iteration of the workshop took place at the University of Lucerne in March 2022.
Below are the workshop materials kept in this repository. Except for the Twitter data, all material is copyright Resul Umit and licensed under CC-BY-SA 3.0. An easy-to-read summary of this permissive licence is available on the Creative Commons website.
Twitter's Developer terms allow for up to 1,500,000 Tweet IDs or 50,000 public Tweets to be deposited online. Hence the materials (compare status_ids.rds and tweets.rds below) follow these terms.
- data/mps.csv
  - a dataset on the members of parliament (MPs) in the British House of Commons, as of January 2021
  - it includes variables on electoral results as well as Twitter usernames
- data/status_ids.rds
  - a dataset with a single variable: status_id
  - lists the status IDs of all tweets posted by the MPs listed in mps.csv during January 2021
  - has 86,125 observations
- data/tweets.rds
  - similar to data/status_ids.rds, except that
    - the time period is now limited to 15 to 31 January, which reduces the number of observations below 50,000 and so allows all variables to be posted
- exercises/solutions.R
  - solutions for exercises until the end of Part 4
- exercises/tweets.Rmd and exercises/tweets_answers.Rmd
  - exercises and solutions for Part 6
- exercises/users.Rmd and exercises/users_answers.Rmd
  - exercises and solutions for Part 5
- presentation/twtr_workshop.Rmd, presentation/twtr_workshop.html, and presentation/twtr_workshop.pdf
  - slides in three formats
- presentation/twtr_workshop_files/ and presentation/libs/
  - files necessary to produce the slides
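
As a minimal sketch, the datasets listed above could be loaded with base R and checked against the deposit limits quoted from Twitter's Developer terms. It assumes that the repository root is the working directory and that the .rds files hold data frames. The helper `within_limit()` and the two limit constants are hypothetical names introduced only for this illustration.

```r
# Deposit limits as quoted in this README from Twitter's Developer terms
max_ids    <- 1500000  # Tweet IDs
max_tweets <- 50000    # public Tweets with all variables

# Hypothetical helper: TRUE if a dataset of n observations is within a limit
within_limit <- function(n, limit) {
  n <= limit
}

# The number of observations reported above for data/status_ids.rds
print(within_limit(86125, max_ids))

# Paths as listed in this README, relative to the repository root (assumed)
paths <- c("data/mps.csv", "data/status_ids.rds", "data/tweets.rds")

# Read the files only if all of them can be found from the working directory
if (all(file.exists(paths))) {
  mps        <- read.csv("data/mps.csv")
  status_ids <- readRDS("data/status_ids.rds")
  tweets     <- readRDS("data/tweets.rds")

  # NROW() counts observations for data frames and plain vectors alike
  stopifnot(
    within_limit(NROW(status_ids), max_ids),
    within_limit(NROW(tweets), max_tweets)
  )
}
```

If the files are not found, the block skips the reading step and only runs the check on the reported count.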