Introduction

Georgia Tech Big Data Analytics Bootcamp

Welcome

Welcome to the Big Data Bootcamp. In this bootcamp, you will learn about big data analytics, including Python data analytics tools and the Spark ecosystem for large-scale data processing.

The training materials were developed by Sunlab, Polo Club, and Dr. Chao Zhang’s lab. To get started, please follow the instructions on the left to set up the learning environment.

Logistics

Schedule

This two-day bootcamp consists of mini-lectures and practice sessions covering core topics in big data analytics. The detailed schedule is below.

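| Date  | Time            | Topic                          | Instructor       | Recording |
|-------|-----------------|--------------------------------|------------------|-----------|
| Day 1 | 9am - 9:50am    | Bootcamp Overview              | Chao             | recording |
| Day 1 | 10am - 10:50am  | Python Tools for Data Analysis | Yinghao          | recording |
| Day 1 | 11am - 11:50am  | Practice Session               |                  |           |
| Day 1 | 12pm - 1pm      | Lunch                          |                  |           |
| Day 1 | 1pm - 1:50pm    | Spark Introduction             | Chao             | recording |
| Day 1 | 2pm - 2:50pm    | Environment Setup              | Yinghao & Yuchen | recording |
| Day 1 | 3pm - 3:50pm    | Practice Session               |                  |           |
| Day 2 | 9am - 9:50am    | Scala & Spark Basics           | Yinghao          | recording |
| Day 2 | 10am - 10:50am  | Spark SQL & GraphX             | Yuchen           | recording |
| Day 2 | 11am - 11:50am  | Practice Session               |                  | recording |
| Day 2 | 12pm - 1pm      | Lunch                          |                  |           |
| Day 2 | 1pm - 1:50pm    | Predictive Models              | Chao             | recording |
| Day 2 | 2pm - 2:50pm    | Spark MLlib                    | Yuchen           | recording |
| Day 2 | 3pm - 3:50pm    | Practice Session               |                  |           |
| Day 2 | 3:50pm - 4pm    | Wrap-up                        | Chao             |           |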