Hive for Data Ops

The objective of this course is to familiarize you with the many factors to consider when configuring Hive for use with Qubole.

ABOUT THIS COURSE

LEARNING FORMAT:

Self-paced

DESCRIPTION:

This course introduces key best practices for running Hive in Qubole. It covers the data preparation and optimization options to consider when working with Hive on the Qubole platform.

Estimated time to complete this course: 30 minutes.

Recommended Prerequisites:

Hive Data Preparation

Because Hive is file-format agnostic, Qubole can read and write files of various structures, including text-based, columnar, and binary. In this lesson you’ll learn the following concepts to consider when preparing your data for Hive optimization in Qubole:

  • File Types
  • Columnar File Formats
  • Hive Partitioning
  • Managing & Calculating Split Size
  • ORC Optimizations
  • Hive Vectorization
  • File Compression
  • Common Data Load Failure Scenarios
  • Reducer Optimization

Hive Environment Optimization

In this lesson you’ll learn the following concepts related to using HiveServer2 in Qubole:

  • Memory Management
  • Tez Application Master Sessions
  • API Commands
  • Tez Optimization
  • Application Master & Container Memory
  • Memory Failure: Java Heap Space
  • Hive Bootstrap
  • Common Environment Failure Scenarios

Curriculum

  • Course Introduction
  • Course Terminology
  • Hive Data Preparation and Ingestion
  • Hive File Types and Compression
  • Hive Environment Optimization
  • HiveServer2 Management
  • Troubleshooting Failure Scenarios
  • Course Conclusion
