Best Apache Spark Tutorials and Courses
Big Data Analysis with Scala and Spark (Coursera)
Taught By: Prof. Heather Miller (Carnegie Mellon University)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 28 hours to complete
Course Description: This course covers the following topics:
- Reading data from persistent storage and loading it into Apache Spark
- Manipulating data with Spark and Scala
- Expressing algorithms for data analysis in a functional style
- Recognizing how to avoid shuffles and recomputation in Spark
Prerequisite: You need to have good programming experience.
Price: Both Paid and Free
Course Link: Visit the course here
Apache Spark with Scala - Hands On with Big Data! (Udemy)
Taught By: Frank Kane (Founder, Sundog Education)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 9 hours to complete
Course Description: This course covers the following topics:
- Introduction to Scala
- Using Resilient Distributed Datasets (RDDs)
- Spark SQL, DataFrames, and Datasets
- Advanced examples of Spark programs
- Running Spark on a Cluster
- Machine Learning with Spark ML
- Intro to Spark Streaming
- Intro to GraphX
Prerequisite: You need to have good programming experience. Basic knowledge of Scala is beneficial but not required.
Price: INR 700 (as of October 2020)
Course Link: Visit the course here
Taming Big Data with Apache Spark and Python - Hands On! (Udemy)
Taught By: Frank Kane (Founder, Sundog Education)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 7 hours to complete
Course Description: This course covers the following topics:
- Getting started with Spark
- Spark Basics and the RDD Interface
- Spark SQL, DataFrames, and Datasets
- Advanced examples of Spark programs
- Running Spark on a Cluster
- Machine Learning with Spark ML
- Spark Streaming, Structured Streaming, and GraphX
Prerequisite: You need to have good programming experience. Python knowledge is beneficial but not required.
Price: INR 700 (as of October 2020)
Course Link: Visit the course here
Spark and Python for Big Data with PySpark (Udemy)
Taught By: Jose Portilla (Head of Data Science, Pierian Data Inc.)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 10.5 hours to complete
Course Description: This course covers the following topics:
- Setting up Python Spark
- Databricks Setup
- Local VirtualBox Setup
- AWS EC2 PySpark Setup
- AWS EMR Cluster Setup
- Introduction to Python
- Spark DataFrame Basics
- Introduction to Machine Learning with MLlib
- Linear Regression, Logistic Regression, Decision Trees and Random Forests, K-means Clustering, Collaborative Filtering
- Natural Language Processing
- Spark Streaming with Python
Prerequisite: You need to have basic knowledge of Python programming.
Price: INR 700 (as of October 2020)
Course Link: Visit the course here
Apache Spark Essential Training (LinkedIn Learning)
Taught By: Ben Sullins
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 1.5 hours to complete
Course Description: This course covers the following topics:
- Understanding Spark
- Understanding Data Interfaces
- Working with Text Files
- Loading CSV Data into DataFrames
- Using Spark SQL to analyze data
- Running Machine Learning Algorithms using MLlib
- Querying Streaming Data
- Connecting BI tools to Spark
Prerequisite: You need to have good programming experience.
Price: Paid
Course Link: Visit the course here
Apache Spark Fundamentals (Pluralsight)
Taught By: Justin Pihony (Pluralsight Author)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 4 hours 15 minutes to complete
Course Description: This course covers the following topics:
- Introduction to Big Data
- Introduction to Apache Spark
- Working with data in Apache Spark
- RDDs in Apache Spark
- Accumulating and Caching Data
- Java in Spark
- AWS Setup
- Cluster Management
- GraphX
- Intro to Machine Learning with Spark
Prerequisite: You need to have good programming experience.
Price: Paid
Course Link: Visit the course here
Big Data Analytics Using Spark (edX)
Taught By: Yoav Freund (Prof. of Computer Science and Engineering at UC San Diego)
Course Type: Video
Course Level: Advanced
Course Duration: Approx. 10 weeks to complete
Course Description: This course covers the following topics:
- Programming Spark using PySpark
- Identifying the computational tradeoffs in a Spark application
- Performing data loading and cleaning using Spark and Parquet
- Modeling data through statistical and machine learning methods
Prerequisite: You need to have prior experience in programming and machine learning.
Price: Both Paid and Free
Course Link: Visit the course here
Become a Data Scientist by Learning Spark (Udacity)
Taught By: David Drummond (VP of Engineering at Insight) and Judit Lantos (Senior Data Engineer at Netflix)
Course Type: Video
Course Level: Intermediate
Course Duration: Approx. 10 hours to complete
Course Description: This course covers the following topics:
- Understanding the Big Data ecosystem
- Data wrangling with Spark, using Spark SQL and Spark DataFrames
- Debugging and Optimization
- Machine Learning with Spark
Prerequisite: You need to have good experience in programming and machine learning.
Price: Free
Course Link: Visit the course here