Knowledge Transfer Microsoft Certified Training Partner CTEC
Knowledge Transfer is a Microsoft Certified Gold Partner

Hadoop Administration Overview


1. Introduction to Hadoop

  • What is Hadoop? Who are the major vendors?
  • A dive into the Hadoop Ecosystem
  • Benefits of using Hadoop
  • How to use Hadoop within your infrastructure?
  • Where do we use Hadoop?
  • When should we look at options besides Hadoop?

2. Introduction to MapReduce

  • What is MapReduce?
  • Why do you need MapReduce?
  • How to use MapReduce in Hadoop?
  • Lab: How does it work from languages like Java?
  • How does it work with languages like Ruby?
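
The map, shuffle, and reduce steps named in this module can be sketched in a few lines. This is a minimal single-process illustration in plain Python, not the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

On a real cluster the map and reduce functions run in parallel on many nodes and the framework handles the grouping; the sketch only shows the shape of the computation.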

3. Introduction to YARN

  • What is YARN?
  • What are the advantages of using YARN over classical MapReduce?
  • How to use YARN within Hadoop?
  • Lab: How does it work from languages like Java?
  • How does it work with languages like Ruby?

4. Introduction to HDFS

  • What is HDFS?
  • Why do you need a distributed file system?
  • How is a distributed file system different from a traditional file system?
  • What is unique about HDFS when compared to other file systems?
  • Is HDFS reliable?
  • Does it offer support for compression, checksums and data integrity?
  • Overview of HDFS commands
  • Lab: Standard file system commands
  • Lab: Moving data to and from HDFS
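
The checksum question in this module can be illustrated with a small sketch of per-chunk verification: a checksum is recorded for each fixed-size chunk when data is written and recomputed when it is read. This is plain Python, not HDFS code; the 512-byte chunk size, the use of `zlib.crc32`, and the function names are assumptions made for illustration.

```python
import zlib
from typing import List

CHUNK_SIZE = 512  # illustrative chunk size, an assumption for this sketch

def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Compute one CRC32 value per fixed-size chunk of the data."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def find_corrupt_chunks(data: bytes, expected: List[int],
                        chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return the indexes of chunks whose recomputed checksum differs from the stored one."""
    actual = chunk_checksums(data, chunk_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]

if __name__ == "__main__":
    original = bytes(range(256)) * 8          # 2048 bytes, four chunks
    stored = chunk_checksums(original)        # recorded at write time
    damaged = bytearray(original)
    damaged[600] ^= 0xFF                      # flip bits in the second chunk
    print(find_corrupt_chunks(bytes(damaged), stored))
```

In a replicated file system, a reader that finds a mismatch like this can fetch the same data from another replica instead of returning corrupt bytes.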

5. Hadoop Deployment and Administration

  • Best Practices for Hadoop Cluster Hardware and Software
  • Basic Hadoop Operations
  • Hadoop deployment options - which one to choose?
  • Can I run Hadoop on a single machine?
  • Do I need to build my own cluster or can I run it on Amazon EC2?

6. Installing Hadoop

  • Lab: Installing Hadoop using Ambari
  • Installing Hadoop manually
  • Which Hadoop distribution to choose?
  • Do I use the Apache version or do I use an enterprise distribution like Hortonworks or Cloudera?
  • Benchmarking Hadoop
  • Lab: Is there a way to test the efficiency of my Hadoop installation?
  • Is it possible to tune my Hadoop installation?

7. User Accounts in Hadoop

  • Creating a multi-user environment in Hadoop
  • Can I use existing Kerberos authentication with Hadoop?
  • Can various groups within my company isolate their data from each other?
  • What are the various groups and users in a Hadoop ecosystem?
  • Lab: How do I use Hue to configure accounts?

8. Logs and Configuration

  • Lab: Understanding logs and directory structures in Hadoop
  • Where does Hadoop store its logs?
  • Can I access the logs through the web?
  • How do I debug my Hadoop installation for problems?
  • Lab: Understanding configuration files
  • How do I manually configure Hadoop?
  • Where are the various files located?

9. Monitoring the cluster

  • Starting and restarting Hadoop services
  • Is it possible to gracefully restart a cluster?
  • What happens to any existing jobs that were running when the cluster goes down?
  • Monitoring various metrics on Hadoop
  • Tracking the status of jobs
  • Lab: How do I find information about the individual pieces that make up my application?
  • Is it possible to find information about which node my job is running on?
  • Monitoring the cluster with Nagios and Ganglia

10. Resource sharing in Hadoop

  • Understanding Schedulers in Apache Hadoop
  • How does Hadoop share resources in a cluster?
  • How do I configure my cluster for equitable access?
  • Creating resource queues
  • Capacity Scheduling with Hadoop
  • What is capacity scheduling?
  • Lab: How do I use capacity scheduling within my organization?

11. How safe is my data in Hadoop?

  • Data Integrity with Apache Hadoop
  • Do I need RAID or does Hadoop provide other mechanisms for data integrity?
  • NameNode Backup and Recovery
  • Do I need to back up my Hadoop cluster?
  • Hadoop Security

 

 



To inquire about future classes, request a class date if one is not scheduled.




If you would like to have KTCS deliver this class for your organization, please call us at 866-444-6548.