Hive Resources Best Practices
Hive Environment Best Practices
Hive Dynamic Partitioning
Hive for Data Engineers
This course is designed to help you understand how to optimize Hive features when working with Qubole.
The objective of this course is to familiarize you with Optimizing Hive Joins, Hive Tuning and Dynamic Partitioning.
Estimated time to complete this course: 30 minutes.
Optimizing Hive Joins
Configuring key Hive Join options will help optimize your workflow. In this lesson you'll learn about:
- Map Joins
- Outer Joins
- Bucket Joins
- Skew Joins
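As a rough preview of the kind of join options this lesson covers, the sketch below sets a few standard Hive properties for a session. The values are illustrative assumptions, not recommendations from this course; suitable settings depend on your data and cluster.

```sql
-- Map joins: let Hive convert a join to a map-side join when one table is small.
SET hive.auto.convert.join=true;
-- Size threshold (in bytes) below which a table counts as "small" (assumed value).
SET hive.mapjoin.smalltable.filesize=25000000;

-- Bucket joins: allow bucket map joins when both tables are bucketed on the join key.
SET hive.optimize.bucketmapjoin=true;

-- Skew joins: handle heavily skewed join keys in a separate follow-up job.
SET hive.optimize.skewjoin=true;
-- Number of rows per key above which the key is treated as skewed (assumed value).
SET hive.skewjoin.key=100000;
```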
Hive Tuning
There are many ways to tune Hive to maximize the efficiency of your queries. In this lesson you'll learn about:
- Hive Aggregation
- Reducer optimization
- Hive User Defined Functions
- Hive Sessions
- Hive Storage Handlers
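For orientation, here is a minimal sketch of session settings related to the aggregation and reducer topics above. The property names are standard Hive properties; the values are assumptions chosen for illustration only.

```sql
-- Hive aggregation: perform partial aggregation on the map side before the shuffle.
SET hive.map.aggr=true;

-- Reducer optimization: target input size per reducer, in bytes (assumed value).
SET hive.exec.reducers.bytes.per.reducer=256000000;
-- Upper bound on the number of reducers Hive may launch (assumed value).
SET hive.exec.reducers.max=1009;
```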
There are several ways you can use Dynamic Partitioning to improve query performance. This section provides recommended best practices for the following items:
- Command Execution
- Many Small Files
- Hive File Output Behavior
- Utilizing Cluster Resources
- Entire System Scan
- Configuring Dynamic Partitioning
- Tez Split Calculation & Application Master
- Final Output Format
- Transitioning From a Database to Hive
- File Compression
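The sketch below shows what configuring dynamic partitioning typically looks like in a Hive session. The table names (`staging_sales`, `sales_by_day`) and column names are hypothetical, and the limits are assumed values rather than settings prescribed by this course.

```sql
-- Enable dynamic partitioning and allow every partition column to be dynamic.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Caps on the number of partitions one statement may create (assumed values).
SET hive.exec.max.dynamic.partitions=1000;
SET hive.exec.max.dynamic.partitions.pernode=100;

-- Many small files: merge small output files when the job finishes.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- File compression: compress the final query output.
SET hive.exec.compress.output=true;

-- Hypothetical tables: the dynamic partition column (sale_date) comes last in the SELECT.
INSERT OVERWRITE TABLE sales_by_day PARTITION (sale_date)
SELECT order_id, amount, sale_date
FROM staging_sales;
```

Because the partition value is taken from the data, a single statement can write many partitions at once, which is why the partition limits and small-file merging settings are usually reviewed together.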
Recommended Follow Up:
Course Version & Product Release
This course is based on Release 54. To see the latest updates to Qubole, please refer to the release notes in our documentation.