STAT40120 Categorical Data Analysis

Academic Year 2021/2022

Categorical data (e.g. number of accidents, number of successfully treated patients) arises in many areas of science, business and administration. This module introduces categorical data analysis and modelling. It includes description and discussion of traditional methods for analysing one-way and two-way tables of counts and of simple tests based on the binomial distribution. It then introduces a general approach to categorical data analysis based on generalized linear models. It details the meaning of parameters in a range of typical linear models. There follow sections on regression for Poisson responses and rates, logistic regression, loglinear models for multiway contingency tables and extensions of logistic regression to nominal and ordinal multicategory models. In all cases a modelling approach is adopted. The module will include computer laboratories most weeks. The emphasis is both on the mathematical basis of such methods and on the practical use of such tools in data analysis.

Curricular information is subject to change

Learning Outcomes:

On completion of this module students should be able to:
- Propose an appropriate approach to the analysis of categorical data arising from a wide range of sources.
- Analyse binary data and its extensions to multicategory data; formulate and select an appropriate model; and display and interpret the results of the selected model.
- Analyse count data (including multiway contingency tables); formulate and select an appropriate model; and display and interpret the results of the selected model.
- In all modelling, develop a parsimonious description of the data.
- Prepare a report of the analysis for a non-statistical client.

Indicative Module Content:

Introduction to categorical data
Contingency tables; odds ratios; Fisher's exact test; McNemar's test; maximum likelihood
Generalized linear models: logistic regression; probit regression; Newton-Raphson; deviance
Poisson regression models
Multicategory logit models
Log linear models for contingency tables
Poisson regression for rates

Student Effort Hours:
Student Effort Type | Hours
Lectures | 24
Tutorial | 2
Computer Aided Lab | 9
Specified Learning Activities | 36
Autonomous Student Learning | 60
Total | 131

Approaches to Teaching and Learning:
Lectures; problem-based learning; computer laboratories; report writing. 
Requirements, Exclusions and Recommendations
Learning Requirements:

Knowledge of linear models to the level of STAT30240 is desirable.


Module Requisites and Incompatibles
Additional Information:
Knowledge of probability and statistical inference to the level of the Probability Theory (STAT20110) and Statistical Inference (STAT20100) modules is required. Knowledge of calculus and linear algebra at First Science level is also required.


 
Assessment Strategy
Description | Timing | Open Book Exam | Component Scale | Must Pass Component | % of Final Grade
Examination: End of trimester exam, 2 hours | End of Trimester Exam | No | Standard conversion grade scale 40% | No | 70
Continuous Assessment: There will be approximately 10 assignments, some based on computer laboratories | Throughout the Trimester | n/a | Standard conversion grade scale 40% | No | 30


Carry forward of passed components: No

Resit In | Terminal Exam
Summer | Yes - 2 Hour
Feedback Strategy/Strategies

• Feedback individually to students, post-assessment
• Group/class feedback, post-assessment

How will my Feedback be Delivered?

Assignments will be graded, annotated and discussed in tutorials.

Agresti, A. (2007). An introduction to categorical data analysis. Wiley.

Agresti, A. (2002). Categorical data analysis. Wiley.

Ugarte, M., Militino, A. and Arnholt, A. (2016). Probability and Statistics with R. 2nd edition. CRC Press.

Dalgaard, P. (2002). Introductory Statistics with R. Springer, pp. 265.

Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S. Springer, pp. 495.

Fleiss, J. (1973). Statistical methods for rates and proportions. Wiley.

Kleinbaum, D. (1994). Logistic regression. Springer-Verlag New York, Inc.

Name | Role
Dr Fabian Ofurum | Tutor