Introducing the PyCharm Databricks Integration

We’re introducing the Databricks integration with PyCharm Professional to make it easier for you to process, store, and analyze your data! 

The integration allows you to build your data and AI apps on the Databricks Data Intelligence Platform directly within PyCharm Professional, pairing the data analytics platform with the powerful Python IDE by JetBrains. You can write code quickly, run it in the cloud without extra configuration, and get additional benefits for working with data.

Read this blog post to learn more about the integration, who it will be useful for, and what benefits it offers.

What is Databricks?

The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data.

What is PyCharm Professional?

PyCharm Professional is a leading IDE for Python and other programming languages. It allows you to write high-quality and efficient code using superior code completion, refactoring capabilities, code inspections, seamless code and project navigation, a debugger, and a wide range of integrations, including Jupyter notebooks, testing frameworks, Git, CI/CD solutions, and more – all available in one place right out of the box.

Who will the integration be useful for? 

Organizations and data professionals using data lakehouses, data lakes, and data warehouses via Databricks will benefit from this integration.

What benefits does the integration bring?

The integration combines the most powerful capabilities of each platform, allowing you to easily build all of your data and AI applications at scale within PyCharm: 

  • PyCharm lets you apply the software development best practices that large codebases depend on, such as source control, modular code layouts, and testing.
  • Databricks gives you access to powerful clusters, so you can work on projects too large for a local machine and orchestrate data processing efficiently.
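
As a rough illustration of the modular layout and testing mentioned in the first bullet, transformation logic can live in a plain function that is tested locally before anything runs on a cluster. This is only a sketch: `normalize_event` and its field names are hypothetical and are not part of the plugin or of Databricks.

```python
from datetime import datetime, timezone


def normalize_event(raw: dict) -> dict:
    """Return a cleaned copy of one raw event record (hypothetical schema)."""
    return {
        "user": raw["user"].strip().lower(),
        "amount": round(float(raw["amount"]), 2),
        # Store timestamps as timezone-aware UTC datetimes.
        "ts": datetime.fromtimestamp(int(raw["ts"]), tz=timezone.utc),
    }


def test_normalize_event() -> None:
    """A pytest-style test that runs locally, with no cluster involved."""
    event = normalize_event({"user": "  Alice ", "amount": "19.999", "ts": "0"})
    assert event["user"] == "alice"
    assert event["amount"] == 20.0
    assert event["ts"] == datetime(1970, 1, 1, tzinfo=timezone.utc)
```

Keeping functions like this free of cluster-specific code means they can be run through PyCharm's test runner and debugger before the surrounding job is deployed.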

You can write the code for your pipelines and jobs in PyCharm, then deploy, test, and run it in real time on your Databricks cluster without any additional configuration.

Let’s dive into more details about what the PyCharm Databricks integration provides.

Connect to your cluster via PyCharm

You can connect directly to the Databricks cluster via PyCharm and monitor the process within the IDE. This allows you to check if the cluster is running, see the results of the current session’s runs, and view process outcomes along with additional details.

Run Python scripts on a remote cluster

In addition, you can run Python scripts on a remote cluster, which is particularly useful for working with big data, and view the results in the IDE.
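
For a sense of what such a script might look like, here is a minimal sketch. It assumes PySpark is available on the cluster and that the `DATABRICKS_RUNTIME_VERSION` environment variable is set there; the function names and the bucket-counting task are hypothetical and are not something the plugin requires.

```python
import os


def expected_bucket_counts(n_rows: int, n_buckets: int) -> dict:
    """Pure-Python reference: how many ids in range(n_rows) fall in each bucket."""
    base, extra = divmod(n_rows, n_buckets)
    return {b: base + (1 if b < extra else 0) for b in range(n_buckets)}


def spark_bucket_counts(n_rows: int, n_buckets: int) -> dict:
    """Compute the same counts with Spark; intended to run on the cluster."""
    # Imported lazily so this file can still be imported where PySpark is absent.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(n_rows).withColumn("bucket", F.col("id") % n_buckets)
    rows = df.groupBy("bucket").count().collect()
    return {row["bucket"]: row["count"] for row in rows}


if __name__ == "__main__" and "DATABRICKS_RUNTIME_VERSION" in os.environ:
    # Assumption: this variable is only present inside a Databricks runtime,
    # so the Spark job is skipped when the file is run on a local machine.
    counts = spark_bucket_counts(1_000_000, 10)
    # The comparison assumes n_rows >= n_buckets, so no bucket is empty.
    assert counts == expected_bucket_counts(1_000_000, 10)
    print(counts)
```

The pure-Python helper doubles as a local check, while the Spark function does the heavy lifting only when the script actually runs on the cluster.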

Run Jupyter notebooks or Python scripts as workflows

You can also run your notebooks or Python scripts as a Databricks workflow and see the output in the console.

You can see the results of the runs on the Databricks platform, including the runs initiated from PyCharm.

Synchronize project files to the Databricks workspace

The synchronization of project files with the Databricks workspace allows you to access and work with the same files in both PyCharm and Databricks workspaces. You can also schedule your notebooks and scripts and utilize other platform features for projects completed in PyCharm. 

How to get started

Make sure you have the following ready to go:

You can install the Databricks plugin either from JetBrains Marketplace or directly from within the PyCharm IDE.

Head over to the documentation to get step-by-step instructions on how to get started and use the plugin.

What do you think about this integration? Share your thoughts in the comments below.
