Curricular information is subject to change
On successful completion of this module the learner will be able to:
- Review data processing using the Shell and traditional data management systems using SQL
- Understand the problem of managing data at scale and why traditional data management systems are failing
- Understand the various data management paradigms used in the context of Big Data (e.g., relational, NoSQL)
- Understand the role of distributed file systems (e.g., HDFS) that support big data programming
- Understand Big Data programming models such as MapReduce and Spark, and how to use them on real examples
- Understand Spark extensions for various big data applications, such as MLlib, GraphX, and Spark Streaming
Student Effort Type | Hours |
---|---|
Lectures | 12 |
Practical | 24 |
Autonomous Student Learning | 64 |
Total | 100 |
Not applicable to this module.
Description | Timing | Component Scale | | % of Final Grade |
---|---|---|---|---|
Group Work Assignment: A comparative study on solving a data-intensive task with and without big data programming. | n/a | Graded | No | 30 |
Exam (In-person): 2-hour closed-book paper-based exam | n/a | Graded | No | 70 |
Resit In | Terminal Exam |
---|---|
Summer | Yes - 2 Hour |
• Feedback individually to students, on an activity or draft prior to summative assessment
• Group/class feedback, post-assessment
• Self-assessment activities
Solutions to lab practices will be provided.