Big Data Tools

A data engineering plugin

Big Data Tools 2023.1 Is Out!

Anna Maltseva

Our latest release includes several new features and improvements based on feedback from our users. In this release, we’ve added integration with Kafka Schema Registry and Kerberos authentication, and we’ve extended support across all cloud storages in Big Data Tools. Read on to learn about the most important changes in the Big Data Tools plugin or try it right now by installing it in IntelliJ IDEA Ultimate, PyCharm Professional, DataSpell, or DataGrip 2023.1.

New: Kafka Schema Registry connection

In response to numerous requests, we have integrated the Kafka Schema Registry connection into the Big Data Tools plugin 2023.1. The Schema Registry defines the data structure and helps keep your Kafka applications synchronized with data changes. With this latest update, you can explore Kafka topics serialized in Avro or Protobuf directly from the IDE.

We also enabled a connection to Schema Registry via a secure SSH tunnel to help you consume and produce messages from your local machine.
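
For readers curious how a Kafka message is tied to a schema, here is a minimal sketch. It assumes the commonly used Confluent wire format, in which each message value starts with a zero magic byte followed by a 4-byte big-endian schema ID. The post does not describe how the plugin decodes messages, so the function name and the sample bytes below are purely illustrative.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0    # assumed Confluent wire format: first byte is always 0
HEADER_SIZE = 5   # 1 magic byte + 4-byte big-endian schema ID


def split_confluent_message(raw: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, payload) for a Schema Registry framed message."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte header")
    magic, schema_id = struct.unpack(">bI", raw[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, raw[HEADER_SIZE:]


# Hypothetical message: schema ID 42 followed by placeholder payload bytes.
example = struct.pack(">bI", MAGIC_BYTE, 42) + b"placeholder"
schema_id, payload = split_confluent_message(example)
print(schema_id, payload)
```

The schema ID is what a client would look up in the registry to obtain the Avro or Protobuf definition needed to decode the payload. For Protobuf, the payload typically begins with additional message-index data before the serialized message itself.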

More convenient work with cloud storages

We have aligned our feature set between all supported remote file storages (such as Amazon S3, Google Cloud Storage, and Microsoft Azure Blob Storage). Here are the most notable improvements:

You can now view and manage versions of the bucket objects right in your IDE.
If you use object or bucket tagging to categorize storage, you can now view and modify the tags without leaving your editor window.
The new contextual search feature lets you quickly locate a specific bucket in your cloud storage by typing relevant keywords.

Extended Hive Metastore integration

The plugin now supports both Hive 2 and Hive 3, so you can preview and understand your data within your IDE. You can open the Hive Metastore or its particular catalogs, databases, and tables in a separate tab of the editor.

Improvements to Apache Zeppelin integration

If you are working with a Zeppelin notebook in your IDE, you can extract Spark job code into a Scala file to continue working on it in an IDE project. In this release, we have added options to extract both selected Scala code and code from paragraphs.

We’ve added code completion for PySpark paragraphs in Zeppelin, which suggests column names inferred from the DataFrame.

We’ve also consolidated Dependencies and Interpreter Settings into a single window, making it easier to find the necessary settings.

Kerberos authentication

It is now possible to connect to Kafka, HDFS, and Hive Metastore by using Kerberos authentication.
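
As a rough illustration of what Kerberos authentication involves on the Kafka side, the sketch below assembles the standard Kafka Java client properties for SASL/GSSAPI. The post does not say whether the plugin's connection dialog accepts settings in this form, so treat this as an assumption. The principal, keytab path, and the choice of SASL_SSL over SASL_PLAINTEXT are hypothetical and depend on your cluster.

```python
def kerberos_kafka_properties(principal: str, keytab_path: str,
                              service_name: str = "kafka") -> str:
    """Render Kafka client properties for Kerberos (SASL/GSSAPI) as text."""
    jaas = (
        "com.sun.security.auth.module.Krb5LoginModule required "
        f'useKeyTab=true storeKey=true keyTab="{keytab_path}" '
        f'principal="{principal}";'
    )
    props = {
        # Assumption: brokers use TLS; use SASL_PLAINTEXT if they do not.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "GSSAPI",
        "sasl.kerberos.service.name": service_name,
        "sasl.jaas.config": jaas,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical principal and keytab location.
print(kerberos_kafka_properties("user@EXAMPLE.COM", "/etc/security/user.keytab"))
```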

Single authorization in AWS services

We’ve added the ability to share AWS authorizations across AWS S3, AWS Glue, and AWS EMR connections. This eliminates the need to repeatedly enter keys or perform MFA authentication for each connection.

Viewing big data binary files

With the Big Data Tools plugin, you can conveniently preview the content of big data file formats without leaving your IDE. The latest release can open Parquet files compressed with zstd, Brotli, or LZ4.
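
If you want to check that a file really is Parquet before opening it, the sketch below relies on the format's 4-byte PAR1 marker at both the start and the end of the file. It only identifies the container. The compression codec is recorded per column chunk in the footer metadata, which this example does not parse. The helper name and the stand-in file are illustrative.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path: str) -> bool:
    """Check for the Parquet magic bytes at the start and end of a file."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# Stand-in file with the right markers; it is not a valid Parquet file.
with tempfile.NamedTemporaryFile(suffix=".parquet", delete=False) as tmp:
    tmp.write(PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC)

print(looks_like_parquet(tmp.name))
os.remove(tmp.name)
```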

For the full list of new features and enhancements, see the changelog on the Big Data Tools plugin page. Please share your feedback with us and report any issues you encounter in our issue tracker.

The Big Data Tools team
