Learning Outcomes:
On completion of the module, students should be able to:
● Distinguish between supervised and unsupervised learning and define regression, classification and clustering problems formally;
● Describe bias, variance and the bias-variance trade-off;
● Describe common loss functions and performance measures;
● Define the problem of overfitting and describe how to overcome it;
● Distinguish among common models, from linear regression and generalised linear models to artificial neural networks, and fit them with the help of a software library;
● Describe the main ideas of statistical learning theory, including the theory of the VC dimension.
Indicative Module Content:
Topics of the course are drawn from:
● Motivation: goals of prediction and inference/understanding
● Supervised and unsupervised learning: regression, classification, clustering
● Measuring performance: accuracy and interpretability
● Bias, variance and the bias-variance trade-off
● Generalisation and stability
● Model selection
● Loss functions
● The problem of overfitting: regularisation
● Sparse models including the lasso, elastic net and support vector machine
● Generalised linear models
● Artificial neural networks
● Deep neural networks
● Model capacity, shattering and VC dimension