Knowledge Transfer Microsoft Certified Training Partner CTEC
Knowledge Transfer is a Microsoft Certified Gold Partner
Designing and Building Big Data Applications Overview



  • Introduction

  • Application Architecture

    • Scenario Explanation

    • Understanding the Development Environment

    • Identifying and Collecting Input Data

    • Selecting Tools for Data Processing and Analysis

    • Presenting Results to the User



  • Defining and Using Data Sets

    • Metadata Management

    • What is Apache Avro?

    • Avro Schemas

    • Avro Schema Evolution

    • Selecting a File Format

    • Performance Considerations



  • Using the Kite SDK Data Module

    • What is the Kite SDK?

    • Fundamental Data Module Concepts

    • Creating New Data Sets Using the Kite SDK

    • Loading, Accessing, and Deleting a Data Set



  • Importing Relational Data with Apache Sqoop

    • What is Apache Sqoop?

    • Basic Imports

    • Limiting Results

    • Improving Sqoop’s Performance

    • Sqoop 2



  • Capturing Data with Apache Flume

    • What is Apache Flume?

    • Basic Flume Architecture

    • Flume Sources

    • Flume Sinks

    • Flume Configuration

    • Logging Application Events to Hadoop



  • Developing Custom Flume Components

    • Flume Data Flow and Common Extension Points

    • Custom Flume Sources

    • Developing a Flume Pollable Source

    • Developing a Flume Event-Driven Source

    • Custom Flume Interceptors

    • Developing a Header-Modifying Flume Interceptor

    • Developing a Filtering Flume Interceptor

    • Writing Avro Objects with a Custom Flume Interceptor



  • Managing Workflows with Apache Oozie

    • The Need for Workflow Management

    • What is Apache Oozie?

    • Defining an Oozie Workflow

    • Validation, Packaging, and Deployment

    • Running and Tracking Workflows Using the CLI

    • Hue UI for Oozie



  • Processing Data Pipelines with Apache Crunch

    • What is Apache Crunch?

    • Understanding the Crunch Pipeline

    • Comparing Crunch to Java MapReduce

    • Working with Crunch Projects

    • Reading and Writing Data in Crunch

    • Data Collection API Functions

    • Utility Classes in the Crunch API



  • Working with Tables in Apache Hive

    • What is Apache Hive?

    • Accessing Hive

    • Basic Query Syntax

    • Creating and Populating Hive Tables

    • How Hive Reads Data

    • Using the RegexSerDe in Hive



  • Developing User-Defined Functions

    • What are User-Defined Functions?

    • Implementing a User-Defined Function

    • Deploying Custom Libraries in Hive

    • Registering a User-Defined Function in Hive



  • Executing Interactive Queries with Impala

    • What is Impala?

    • Comparing Hive to Impala

    • Running Queries in Impala

    • Support for User-Defined Functions

    • Data and Metadata Management



  • Understanding Cloudera Search

    • What is Cloudera Search?

    • Search Architecture

    • Supported Document Formats



  • Indexing Data with Cloudera Search

    • Collection and Schema Management

    • Morphlines

    • Indexing Data in Batch Mode

    • Indexing Data in Near Real Time



  • Presenting Results to Users

    • Solr Query Syntax

    • Building a Search UI with Hue

    • Accessing Impala through JDBC

    • Powering a Custom Web Application with Impala and Search




 
