This course introduces key best practices for Hive in Qubole, covering the data preparation and optimization options to consider when working with Hive on the Qubole platform.
Estimated time to complete this course: 30 minutes.
Hive Data Preparation
Because Hive is file-format agnostic, Qubole can read and write files of various structures, including text-based, columnar, and binary formats. In this lesson you’ll learn the following concepts to consider when preparing your data for Hive optimization in Qubole:
- File Types
- Columnar File Formats
- Hive Partitioning
- Managing & Calculating Split Size
- ORC Optimizations
- Hive Vectorization
- File Compression
- Common Data Load Failure Scenarios
- Reducer Optimization
Hive Environment Optimization
In this lesson you’ll learn the following concepts related to using HiveServer2 in Qubole:
- Memory Management
- Tez Application Master Sessions
- API Commands
- Tez Optimization
- Application Master & Container Memory
- Memory Failure: Java Heap Space
- Hive Bootstrap
- Common Environment Failure Scenarios