Learning Outcomes:
(a). Explain and illustrate properties of traditional and big data management and processing systems (ACID, CAP, BASE).
(b). Compare and contrast relational, NoSQL, and NewSQL database management systems.
(c). Describe, distinguish, and work with big data technologies for batch, stream, and graph processing.
(d). Develop (design and implement) NoSQL, batch, streaming, and graph big data applications.
Indicative Module Content:
Introduction to Big Data (Characteristics and classifications)
Big Data reference architectures and classification of data-intensive distributed systems
Database Concepts and Architecture
The Relational Data Model, SQL, and Introduction to MySQL
Introduction to NoSQL databases and the MongoDB document database
MapReduce Programming Model, Introduction to Apache Hadoop, HDFS and YARN
Distributed Data Processing using Apache Spark
Introduction to graph processing and developing large graph applications using Spark's GraphX
Introduction to data streams and developing streaming applications using Spark's Structured Streaming API
Machine learning (supervised, unsupervised, recommendation) using Spark's MLlib API