MIS41120 Statistical Learning

Academic Year 2022/2023

Broadly speaking, Statistical Learning and Machine Learning are computational methods that use (learn from) experience to improve performance or prediction accuracy. They arose in different research communities but overlap significantly. Statistical Learning focusses more on linear models, for which there is a stronger theoretical foundation, and (to an extent) on inference; Machine Learning focusses more on nonlinear methods, which rest more on experimental evidence, and is more often associated with prediction.
This Statistical Learning course discusses these methods and also investigates their foundations: how well they work, error estimates, the trade-offs involved, and so on. These are the principles underpinning algorithmic learning, that is, the methods used in Knowledge Discovery and Data Mining.
Statistical learning refers to supervised and unsupervised learning, especially regression, classification and clustering, applied chiefly to structured numerical data. These are the most common techniques used for modelling, with the goals of inference and prediction, in business (and elsewhere); hence, their statistical theory is well developed.
This module aims to develop both theory and practice to an expert level.

Curricular information is subject to change

Learning Outcomes:

On completion of the module students should be able to:
● Distinguish between supervised and unsupervised learning and define regression, classification and clustering problems formally;
● Describe bias, variance and the bias-variance trade-off;
● Describe common loss functions and performance measures;
● Define the problem of overfitting and describe how to overcome it;
● Distinguish among common models, from linear regression to artificial neural networks to generalised linear models, and execute them with the help of a software library;
● Describe the main ideas of statistical learning theory, including the theory of the VC dimension.

Indicative Module Content:

Topics of the course are drawn from:
● Motivation: goals of prediction and inference/understanding
● Supervised and unsupervised learning: Regression, Classification, Clustering
● Measuring performance: accuracy and interpretability
● Bias, variance and the bias-variance tradeoff
● Generalisation and stability
● Model selection
● Loss functions
● The problem of Overfitting: Regularisation
● Sparse models including the lasso, elastic net and support vector machine
● Generalised linear models
● Artificial neural networks
● Deep nets
● Model capacity, shattering and VC dimension

Student Effort Type | Hours
Lectures | 36
Specified Learning Activities | 40
Autonomous Student Learning | 100
Total | 176

Requirements, Exclusions and Recommendations

Not applicable to this module.


Module Requisites and Incompatibles
Not applicable to this module.
 
Assessment Strategy
Description | Timing | Open Book Exam | Component Scale | Must Pass Component | % of Final Grade
Assignment: Project work on data analysis | Throughout the Trimester | n/a | Graded | No | 25
Examination: Main Examination | 2 hour End of Trimester Exam | No | Standard conversion grade scale 40% | No | 75


Carry forward of passed components
Yes
 
Resit In | Terminal Exam
Autumn | Yes - 2 Hour
Feedback Strategy/Strategies

• Feedback individually to students, post-assessment
• Group/class feedback, post-assessment

How will my Feedback be Delivered?

Feedback on strengths and weaknesses of assignment submission

Name | Role
Assoc Professor Peter Keenan | Lecturer / Co-Lecturer
