Spark for Data Scientists

This course is designed to familiarize you with Spark functionality in Qubole.

ABOUT THIS COURSE

LEARNING FORMAT:

Self-paced

DESCRIPTION:

This course introduces you to key best practices related to Spark Notebooks & Tuning in Qubole. By leveraging the features provided by Spark, you’ll help your enterprise lower costs and increase the productivity of your data teams.

Estimated time to complete this course: 30 minutes.

Recommended Prerequisites:

Spark Notebooks

Data Scientists often use Notebooks for quick exploration tasks. Once set up, a Notebook provides a convenient way to save, share, and re-run a set of queries on a data source. In this lesson, you'll learn about:

  • The Spark Toolkit
  • Notebook Features & Permissions
  • Using Packages
  • Deep Learning in Qubole
  • Integrating Jars
  • Notebook API
  • Dashboards

Notebook Tuning

In this section, you’ll learn the following key concepts for tuning Notebooks in Qubole:

  • Notebook Interpreters
  • Cache Management
  • Data Format
  • Garbage Collection
  • Resource Manager & Notebook Troubleshooting
  • Notebook Logs

Recommended Follow Up:

Course Version & Product Release

This course is based on Release 50. To see the latest updates to Qubole, please refer to the release notes in our documentation:

What's New In Qubole Release 52: http://docs.qubole.com/en/latest/release-notes/releasenotesR52/index.html 

Curriculum

  • Course Introduction
  • Course Terminology
  • Spark Notebooks Overview
  • Spark Notebook Features
  • Spark Notebook Dashboards
  • Basic Notebook Tuning
  • Course Conclusion
