Learning Outcomes:
Upon successful completion of the course, students will be able to:
1. Understand fundamental issues in (quantitative) text analysis, such as inter-coder agreement, reliability, and validation.
2. Convert texts into quantitative matrices of features, and then analyse those features using statistical methods.
3. Use human coding of texts to train supervised classifiers.
4. Apply these methods to a text corpus to address a substantive research question.
5. Critically evaluate (social science) research that uses automated text analysis methods.
Indicative Module Content:
Statistical software and programming using R and R Markdown; assumptions and workflow of quantitative text analysis approaches; tokenisation and the document-feature matrix; dictionaries and sentiment analysis; describing and comparing texts; human coding and document classification; supervised and unsupervised scaling; multilingual text analysis; topic models; speech recognition; word embeddings.