STAT30240 Predictive Analytics I

Academic Year 2021/2022

Topics covered:

1. Matrix revision and exploratory data analysis

2. Simple linear regression (SLR): properties of least squares estimates; t-tests; F-tests; confidence intervals; prediction intervals; complete SLR analysis in the R statistical software

3. Multiple linear regression (MLR): properties of least squares estimates; t-tests; F-tests; confidence intervals; prediction intervals; complete MLR analysis in the R statistical software

4. Categorical predictors and interactions

5. Analysis of variance

6. Regression diagnostics

7. Variable selection and model building

All material is supplemented by its implementation in the R programming language, which is ranked 7th in the IEEE list of top programming languages.


Curricular information is subject to change

Learning Outcomes:

By the end of the module students should be able to:
(i) Interpret scatterplots for bivariate data.
(ii) Define the correlation coefficient for bivariate data.
(iii) Explain the interpretation of the correlation coefficient for bivariate data and perform statistical inference as appropriate.
(iv) Calculate the correlation coefficient for bivariate data.
(v) Explain what is meant by response and explanatory variables.
(vi) Derive the least squares estimates of the slope and intercept parameters in a simple linear regression model.
(vii) Perform statistical inference on the slope parameter.
(viii) Describe the use of measures of goodness of fit of a linear regression model.
(ix) Use a fitted linear relationship to predict a mean response or an individual response with confidence limits.
(x) Use residuals to check the suitability and validity of a linear regression model.
(xi) State the multiple linear regression model (with several explanatory variables).
(xii) Use appropriate software to fit a multiple linear regression model to a data set and interpret the output.
(xiii) Use measures of model fit to select an appropriate set of explanatory variables.

Indicative Module Content:

Student Effort Hours:
Student Effort Type           Hours
Lectures                      18
Tutorial                      10
Laboratories                  10
Autonomous Student Learning   72
Total                         110

Approaches to Teaching and Learning:
Weekly lectures
Weekly labs covering implementation of the material in R
Weekly tutorials

One assignment
One exam

Requirements, Exclusions and Recommendations
Learning Requirements:

A good understanding of statistics at an introductory level, including t-tests, correlation and covariance, and properties of the expectation and variance operators.


Module Requisites and Incompatibles
Incompatibles:
FIN30520 - Machine Learning Finance, STAT40790 - Predictive Analytics I (online)


 
Assessment Strategy
Description: Assignment: Project 1. Exploratory Data Analysis; Regression Modelling
Timing: Varies over the Trimester
Open Book Exam: n/a
Component Scale: Alternative linear conversion grade scale 40%
Must Pass Component: No
% of Final Grade: 20

Description: Examination: Written examination
Timing: 2 hour End of Trimester Exam
Open Book Exam: No
Component Scale: Alternative linear conversion grade scale 40%
Must Pass Component: Yes
% of Final Grade: 80


Carry forward of passed components: No

Resit In: Spring
Terminal Exam: Yes - 2 Hour
Please see Student Jargon Buster for more information about remediation types and timing. 
Feedback Strategy/Strategies

• Group/class feedback, post-assessment

How will my Feedback be Delivered?

The Assignment has class feedback posted on Brightspace or discussed in class.

1. Linear Models with R, Second Edition by Julian J. Faraway
2. Applied Regression Analysis and Generalized Linear Models, Third Edition by John Fox
3. An R Companion to Applied Regression, Third Edition by John Fox and Sanford Weisberg
4. An R Companion to Linear Statistical Models by Christopher Hay-Jahans
Name                  Role
Dr Conor Finnegan     Lecturer / Co-Lecturer
Gerardina Celentano   Tutor
Ms Courtney Clarke    Tutor
Kate Finucane         Tutor
Hardeep Kaur          Tutor
Catherine Mahoney     Tutor
Uche Mbaka            Tutor