COMP47470 Big Data Programming

Academic Year 2023/2024

Big data refers to high-volume, high-velocity and/or high-variety data that is too complex to be handled by traditional (relational) data management and data processing systems. The data-intensive nature of big data applications has pushed researchers and industry practitioners to build innovative solutions: inherently distributed software systems with novel programming and execution models. This module describes, compares and contrasts the pioneering and leading big data technologies (NoSQL, batch, streaming, and graph). It teaches students how to install a big data software technology (NoSQL, batch, streaming, or graph) and how to develop (code and test) a big data application using that technology.

Curricular information is subject to change

Learning Outcomes:

(a). Explain and illustrate properties of traditional and big data management and processing systems (ACID, CAP, BASE).
(b). Compare and contrast relational, NoSQL and NewSQL data management systems.
(c). Describe, distinguish, and use big data technologies for batch, stream, and graph processing.
(d). Develop (design and implement) NoSQL, batch, streaming and graph big data applications.

Indicative Module Content:

Introduction to Big Data (Characteristics and classifications)
Database Concepts, Architecture, Database Modelling and Design
The Relational Data Model, SQL, and Introduction to MySQL
Introduction to NoSQL Databases, MongoDB document NoSQL database
MapReduce Programming Model, Introduction to Apache Hadoop, HDFS
Distributed Data Processing using Apache Spark
Introduction to Graph processing and developing large graph applications using Spark's GraphX
Introduction to Data Streams and developing streaming applications using Spark's structured streaming API
Machine learning (supervised, unsupervised, recommendation) using Spark's MLlib API

Student Effort Hours:
Student Effort Type | Hours
Autonomous Student Learning |

Approaches to Teaching and Learning:
Lectures, Laboratory Practicals, Weekly Quizzes, Mid-term Assignments, End-term Exam 
Requirements, Exclusions and Recommendations
Learning Recommendations:

It is strongly recommended that students have an acceptable level of competency in Bash scripting and the Python programming language.

Module Requisites and Incompatibles
Not applicable to this module.
Assessment Strategy
Description | Timing | Open Book Exam | Component Scale | Must Pass Component | % of Final Grade
Examination: End of trimester Exam | 2 hour End of Trimester Exam | No | Graded | No |
Continuous Assessment: < Description > | Throughout the Trimester | n/a | Graded | No |


Carry forward of passed components
Remediation Type | Remediation Timing
Repeat | Within Two Trimesters
Please see Student Jargon Buster for more information about remediation types and timing. 
Feedback Strategy/Strategies

• Feedback individually to students, post-assessment
• Group/class feedback, post-assessment
• Online automated feedback

How will my Feedback be Delivered?

Solutions and feedback are provided for the weekly quizzes and for projects.

Name | Role
Mr Cormac Murray | Tutor
Timetabling information is displayed only for guidance purposes, relates to the current Academic Year only and is subject to change.
Practical Offering 1 Week(s) - Autumn: All Weeks Fri 14:00 - 15:50
Lecture Offering 1 Week(s) - 1, 2, 3, 4, 5, 6, 7, 9, 12 Tues 11:00 - 12:50
Lecture Offering 1 Week(s) - 8, 10, 11 Tues 11:00 - 12:50
Practical Offering 1 Week(s) - 20, 21, 22, 23, 24, 25, 26, 29, 30, 31, 32, 33 Thurs 12:00 - 13:50
Lecture Offering 1 Week(s) - 20, 21, 25, 31 Tues 11:00 - 12:50
Lecture Offering 1 Week(s) - 22, 23, 24, 30, 32, 33 Tues 11:00 - 12:50
Lecture Offering 1 Week(s) - 26, 29 Tues 11:00 - 12:50
Practical Offering 1 Week(s) - 21, 22, 23, 24, 25, 26, 29, 30, 31, 32, 33 Wed 11:00 - 12:50