Curriculum
15 Sections
72 Lessons
52 Weeks
Introduction
1
2.1
You, this course and Us
2 Minutes
Why is Big Data a Big Deal?
6
3.1
The Big Data Paradigm
14 Minutes
3.1
Serial vs Distributed Computing
9 Minutes
3.1
What is Hadoop?
7 Minutes
3.1
HDFS or the Hadoop Distributed File System
11 Minutes
3.1
MapReduce Introduced
12 Minutes
3.1
YARN or Yet Another Resource Negotiator
4 Minutes
Installing Hadoop in a Local Environment
3
4.1
Hadoop Install Modes
9 Minutes
4.1
Hadoop Standalone mode Install
16 Minutes
4.1
Hadoop Pseudo-Distributed mode Install
12 Minutes
The MapReduce "Hello World"
7
5.1
The basic philosophy underlying MapReduce
9 Minutes
5.1
MapReduce – Visualized And Explained
9 Minutes
5.1
MapReduce – Digging a little deeper at every step
10 Minutes
5.1
“Hello World” in MapReduce
10 Minutes
5.1
The Mapper
10 Minutes
5.1
The Reducer
8 Minutes
5.1
The Job
12 Minutes
Run a MapReduce Job
2
6.1
Get comfortable with HDFS
11 Minutes
6.1
Run your first MapReduce Job
14 Minutes
Juicing your MapReduce – Combiners, Shuffle and Sort, and the Streaming API
6
7.1
Parallelize the reduce phase – use the Combiner
15 Minutes
7.1
Not all Reducers are Combiners
14 Minutes
7.1
How many mappers and reducers does your MapReduce have?
8 Minutes
7.1
Parallelizing reduce using Shuffle And Sort
15 Minutes
7.1
MapReduce is not limited to the Java language – Introducing the Streaming API
5 Minutes
7.1
Python for MapReduce
12 Minutes
HDFS and Yarn
7
8.1
HDFS – Protecting against data loss using replication
16 Minutes
8.1
HDFS – Name nodes and why they’re critical
7 Minutes
8.1
HDFS – Checkpointing to backup name node information
11 Minutes
8.1
Yarn – Basic components
9 Minutes
8.1
Yarn – Submitting a job to Yarn
13 Minutes
8.1
Yarn – Plug in scheduling policies
14 Minutes
8.1
Yarn – Configure the scheduler
13 Minutes
MapReduce Customizations For Finer Grained Control
4
9.1
Setting up your MapReduce to accept command line arguments
14 Minutes
9.1
The Tool, ToolRunner and GenericOptionsParser
13 Minutes
9.1
Configuring properties of the Job object
11 Minutes
9.1
Customizing the Partitioner, Sort Comparator, and Group Comparator
15 Minutes
The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!
6
10.1
The heart of search engines – The Inverted Index
15 Minutes
10.1
Generating the inverted index using MapReduce
11 Minutes
10.1
Custom data types for keys – The Writable Interface
10 Minutes
10.1
Represent a Bigram using a WritableComparable
13 Minutes
10.1
MapReduce to count the Bigrams in input text
9 Minutes
10.1
Test your MapReduce job using MRUnit
30 Minutes
Input and Output Formats and Customized Partitioning
7
11.1
Introducing the File Input Format
14 Minutes
11.1
Text And Sequence File Formats
13 Minutes
11.1
Data partitioning using a custom partitioner
10 Minutes
11.1
Make the custom partitioner real in code
7 Minutes
11.1
Total Order Partitioning
10 Minutes
11.1
Input Sampling, Distribution, Partitioning and configuring these
10 Minutes
11.1
Secondary Sort
9 Minutes
Recommendation Systems using Collaborative Filtering
4
12.1
Introduction to Collaborative Filtering
15 Minutes
12.1
Friend recommendations using chained MR jobs
7 Minutes
12.1
Get common friends for every pair of users – the first MapReduce
17 Minutes
12.1
Top 10 friend recommendation for every user – the second MapReduce
15 Minutes
Hadoop as a Database
7
13.1
Structured data in Hadoop
14 Minutes
13.1
Running an SQL Select with MapReduce
14 Minutes
13.1
Running an SQL Group By with MapReduce
15 Minutes
13.1
A MapReduce Join – The Map Side
14 Minutes
13.1
A MapReduce Join – The Reduce Side
14 Minutes
13.1
A MapReduce Join – Sorting and Partitioning
13 Minutes
13.1
A MapReduce Join – Putting it all together
9 Minutes
K-Means Clustering
7
14.1
What is K-Means Clustering?
14 Minutes
14.1
A MapReduce job for K-Means Clustering
14 Minutes
14.1
K-Means Clustering – Measuring the distance between points
17 Minutes
14.1
K-Means Clustering – Custom Writables for Input/Output
14 Minutes
14.1
K-Means Clustering – Configuring the Job
8 Minutes
14.1
K-Means Clustering – The Mapper and Reducer
11 Minutes
14.1
K-Means Clustering – The Iterative MapReduce Job
11 Minutes
Setting up a Hadoop Cluster
3
15.1
Manually configuring a Hadoop cluster (Linux VMs)
4 Minutes
15.1
Getting started with Amazon Web Services
14 Minutes
15.1
Start a Hadoop Cluster with Cloudera Manager on AWS
6 Minutes
Appendix
2
16.1
Set up a Virtual Linux Instance (For Windows users)
13 Minutes
16.1
[For Linux/Mac OS Shell Newbies] Path and other Environment Variables
16 Minutes
Learn By Example: Hadoop, MapReduce for Big Data problems