Learning Outcomes:
Upon successful completion of the module, students will be able to:
1. Understand fundamental issues in (quantitative) text analysis such as inter-coder agreement, reliability, validation, accuracy, and precision.
2. Convert texts into quantitative matrices of features, and then analyse those features using statistical methods, topic models, and scaling approaches.
3. Use human coding of texts to train supervised classifiers and fine-tune transformer models.
4. Apply these methods to their own text corpus to address a substantive research question.
5. Critically evaluate (social science) research that uses automated text analysis methods.
Indicative Module Content:
Statistical software and programming using R and R Markdown; assumptions and workflow of quantitative text analysis approaches; tokenisation and the document-feature matrix; dictionaries and sentiment analysis; describing and comparing texts; human coding and document classification; supervised and unsupervised scaling; multilingual text analysis; topic models; speech recognition; word embeddings.
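
Illustrative example: to give a rough sense of what tokenisation and a document-feature matrix involve, the minimal sketch below splits two short texts into lowercase word tokens and counts how often each feature occurs in each document. The module itself works in R. Python, the helper names tokenise and document_feature_matrix, and the two example sentences are assumptions made here purely for illustration and are not taken from the module materials.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def document_feature_matrix(docs):
    """Return the sorted feature list and one row of counts per document."""
    counts = [Counter(tokenise(doc)) for doc in docs]
    features = sorted(set().union(*counts))
    matrix = [[doc_counts[f] for f in features] for doc_counts in counts]
    return features, matrix


# Hypothetical two-document corpus, for illustration only.
docs = ["Taxes are too high.", "We will cut taxes and cut spending."]
features, matrix = document_feature_matrix(docs)

for doc, row in zip(docs, matrix):
    print(doc, dict(zip(features, row)))
```

Each row of the matrix is one document and each column is one feature, so the texts can then be compared or modelled with ordinary statistical tools. Real workflows usually add steps such as stopword removal or stemming, which this sketch omits.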