Big Data Tools 1.0 Generally Available
A new update of the Big Data Tools plugin has been released. This is our first version for general use, after a year and a half of the Early Access Preview program.
Install the plugin from the JetBrains Plugin Repository or from inside your IDE to edit Zeppelin notebooks, upload files to cloud filesystems, and monitor Hadoop and Spark clusters. The following JetBrains IDEs support the plugin: IntelliJ IDEA Ultimate, PyCharm Professional Edition, and DataGrip.
In this release, we’ve added many useful features and addressed a variety of bugs. Let’s dive into the details.
Zeppelin 0.9 Support
The latest version of Apache Zeppelin, 0.9.0, was released last December with 568 tickets resolved, including many new features and bug fixes.
We had been preparing for this release for a long time, testing the BDT plugin thoroughly against Zeppelin 0.9-preview2, so finalizing 0.9 support did not take long. Now we invite everyone to try not only the new version of Big Data Tools but also the new version of Zeppelin.
Zeppelin notebook import/export
The Big Data Tools plugin handles small routine operations for you so you don’t have to switch to the web interface too often. An obvious candidate for automation is importing and exporting notebooks. Now you can save your notebook directly from the IDE to your computer and share it with your colleagues.
Zeppelin interpreter and repository settings
Pick Open Interpreter Settings from the notebook context menu to open this new window:
The screenshot shows the Markdown interpreter settings, just as an example. The markdown.parser.type parameter can take values flexmark, pegdown, or markdown4j. The ability to select flexmark is new and available in Zeppelin 0.9 only.
As you can see, the complete list of interpreters is available in this window, and you can change their settings, too.
This interface is an improved alternative to what already exists in the Zeppelin web interface. A big advantage is that you no longer need to open the browser to view or edit any setting.
Also in this window, you can reload the interpreter or edit the list of repositories:
In Zeppelin, it is possible to declare variables separately from the notebook. All such variables will be available when the interpreter is started, which can be useful for storing configuration values.
The setting has a name like zeppelin.SparkInterpreter.precode. (In theory, any interpreter can be used there, but our highlighting feature only supports Spark and PySpark.) This feature is described in the Zeppelin documentation.
Starting with this update, the Big Data Tools plugin understands code written inside precode. If you use any precode-declared variables inside the notebook, they will be perfectly highlighted, like any other variable.
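To make the idea concrete, here is a minimal sketch in plain Python of how precode and a notebook paragraph relate. The variable names, the path, and the helper function are hypothetical and chosen only for illustration. We also assume the PySpark interpreter follows the same naming pattern as the Spark one (something like zeppelin.PySparkInterpreter.precode), so check your interpreter settings for the exact property name.

```python
# --- Precode (hypothetical value of the interpreter's precode setting) ---
# Assumption: this snippet runs once when the interpreter starts,
# before any notebook paragraph is executed.
INPUT_PATH = "/data/events.parquet"  # hypothetical dataset location
SAMPLE_FRACTION = 0.1                # hypothetical sampling ratio

# --- Notebook paragraph ---
# INPUT_PATH and SAMPLE_FRACTION are not declared in the notebook itself;
# they are visible here only because the precode above defined them.
def describe_job(path: str, fraction: float) -> str:
    """Build a human-readable summary of what the paragraph will read."""
    return f"Reading {fraction:.0%} sample from {path}"

summary = describe_job(INPUT_PATH, SAMPLE_FRACTION)
print(summary)
```

Without precode awareness, an editor would have no way of knowing where INPUT_PATH and SAMPLE_FRACTION come from and would likely mark them as unresolved.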
To configure the precode value, use the interpreter settings window we described above:
Let’s check if the highlighting feature really works:
Running scripts before spark-submit
Now there’s an easy way to configure the environment before starting a task for execution. Right in the spark-submit settings, you can specify a string that will be executed by bash.
If you want to configure a Python environment, you can now run source activate py36 without providing any additional scripts. You can also run echo "Hello World" or any other console command.
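The general pattern of running a bash string before the main command can be sketched in Python as follows. This is only an illustration of the idea, not how the plugin is implemented: the run_with_prescript helper, the PY_ENV variable, and the stand-in echo command (used here in place of a real spark-submit call) are all hypothetical. It assumes bash is available on your PATH.

```python
import subprocess


def run_with_prescript(pre_script: str, command: str) -> str:
    """Run pre_script and then command in a single bash process; return stdout."""
    # Joining with '&&' means the command runs only if the pre-script succeeds.
    # Because both run in the same shell, environment changes made by the
    # pre-script (exports, activated environments) are visible to the command.
    result = subprocess.run(
        ["bash", "-c", f"{pre_script} && {command}"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if either step fails
    )
    return result.stdout


# Hypothetical pre-script and a stand-in for the real spark-submit command.
output = run_with_prescript(
    "export PY_ENV=py36",
    'echo "submitting with $PY_ENV"',
)
print(output)
```

Running both steps in one shell is the key design point: if the pre-script ran in a separate process, any environment it set up would be lost before the job started.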
Improved Python support
We’ve been gradually improving Python support since introducing it last December. Here’s a new window that lets you configure Python, if you haven’t already.
You may want to select the “Install stubs for Spark built-ins” checkbox to greatly improve autocompletion in PySpark.
Improved notebook search
Now you can find notebooks using Search Everywhere, which you open by pressing Shift twice.
Notebooks are displayed in two ways: along with all the results in the All tab, and separately in the Zeppelin Notebooks tab.
Have you ever been in a situation where you wanted to report a bug in the plugin or get in touch with the plugin developers, but you weren’t sure how to quickly do that? Well, now there is an easy way. Simply click the Support dropdown menu in the upper right-hand corner of the Big Data Tools tool window and choose one of three ways to contact us. We’ll be happy to hear from you!
Bug fixes and improvements
We’re actively developing and enhancing the Big Data Tools plugin, doing our best to take into account all your feedback and fix as many bugs as possible.
You can upgrade to the latest version from the plugin page in your browser or from inside the IDE. On the plugin page, you’re very welcome to leave your feedback and suggestions. We always want to know what you think.
Documentation and social networks
Last but not least, if you’re looking for more info about how to use any feature of the plugin, make sure to check out the IntelliJ IDEA, PyCharm, or DataGrip documentation depending on which IDE you use. Still need help? Please don’t hesitate to leave us a message either here in the comments or on Twitter.
Version 1.0 is a big milestone in the history of the Big Data Tools plugin. Hopefully, all of these improvements will be useful for you, allowing you to do more exciting things in a more enjoyable way. Thank you for using the Big Data Tools plugin!
The Big Data Tools plugin team