General description

Big Data describes datasets that are too large, arrive too fast, or both, to be processed on a single computer. “TI2736-B Big Data Processing” provides a practical introduction to the systems and algorithms used to process Big Data.

Learning objectives

[all students] By the end of the course, all students should be able to:

[BSc students]
- Describe the scenarios in which streaming algorithms are most applicable
- Identify the correct streaming algorithm for a given streaming problem

[minor version]
- Design and apply basic data processing pipelines
- Understand basic data analysis concepts (such as aggregation, correlation and linear modelling)

Course Organization


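| Week | Lecture | Who?   | Topic                                            | Teacher | Homework |
|------|---------|--------|--------------------------------------------------|---------|----------|
| 1    | 1       | All    | Course introduction, Big data in the real world  | GG      |          |
| 1    | 2       | All    | Programming techniques for Big Data              | GG      |          |
| 2    | 1       | All    | Distributed storage                              | GG      |          |
| 2    | 2       | All    | Distributed databases                            | GG      |          |
| 3    | 1       | BSc    | Stream processing                                | JH      |          |
| 3    | 2       | BSc    | Stream processing systems                        | JH      |          |
| 3    | 1       | Minors | Introduction to Data Processing                  | GG      |          |
| 3    | 2       | Minors | Introduction to Data Science                     | GG      |          |
| 4    | 1       | All    | Map/Reduce algorithms                            | GG      |          |
| 4    | 2       | All    | Hadoop and friends                               | GG      |          |
| 5    | 1       | All    | Spark RDDs                                       | GG      |          |
| 5    | 2       | All    | Pair RDDs and Dataframes                         | GG      |          |
| 6    | 1       | All    | Algorithms on Spark                              | GG      |          |
| 6    | 2       | All    | Iterative algorithms on Spark                    | GG      |          |
| 7    | 1       | All    | Big Graphs                                       | GG      |          |
| 7    | 2       | All    | Graph processing systems                         | GG      |          |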



