Meet Big Data Tools – Spark Integration and Zeppelin Notebooks in IntelliJ IDEA

Andrey Cheptsov

Hooray! Today we have some exciting news for you. After all, it is not often JetBrains introduces new developer tools. Now that you’re excited as well, we’re very pleased to announce Big Data Tools – a new IntelliJ IDEA plugin that integrates Spark and brings support for editing and running Zeppelin notebooks. Now with the Big Data Tools plugin, you can create, edit, and run Zeppelin notebooks without ever having to leave your favorite IDE. The plugin offers smart navigation, code completion, inspections & quick-fixes, and refactoring inside notebooks.

So, what precisely does the plugin support now?

  • Browsing, creating, and deleting notebooks
  • Coding assistance for Scala paragraphs
    • Code completion
    • Rename refactoring
    • Extract variable, method or parameter
    • Go to declaration
    • Show usages
  • Creating and deleting paragraphs
  • Running paragraphs
  • Browsing paragraphs’ output
  • Support for basic visualization

Note that coding assistance is currently limited to Scala code. Support for other languages will come later. Also, in the future, we plan to go beyond Zeppelin notebooks and add more general features that will streamline the developer experience for data engineers, data scientists, and other professionals who work with Big Data.

If you prefer to see it in action rather than just read about it, make sure to watch this short video:

https://youtu.be/YhhPcdmMHao

Otherwise, go ahead and try it out for yourself! Below is a short introduction to getting started.

How do I use the plugin?

  • Make sure you’re using IntelliJ IDEA Ultimate 2019.2 (note that it doesn’t work with 2019.3 EAP yet).
  • Make sure you’ve installed the BashSupport, Python, and Scala plugins (the latest releases should do). For now, you have to install those manually – this will be improved soon.
  • Install the latest build of the Big Data Tools plugin.
  • If everything has worked as expected, after an IDE restart you’ll see the Big Data Tools tool window on the right-hand side. Open it, click the ‘+’ icon on its toolbar, and choose Zeppelin.
  • In the Zeppelin connection dialog, fill in the connection parameters for your Zeppelin instance (host, port, credentials, and so on). Use Test Connection to verify that the IDE can reach the server.
  • Once the Zeppelin configuration is set up, you’ll see the instance in the tool window along with the notebooks tree. Feel free to manage the notebooks or open them in the editor.
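
If Test Connection fails, it can help to confirm from outside the IDE that the server answers over HTTP. The sketch below is not how the plugin checks the connection. It assumes the server exposes Zeppelin's REST API version endpoint at /api/version. The host, the port, and the helper names are placeholders chosen for illustration. The response shape handled in parse_version (a JSON envelope whose body is either a plain string or an object with a version field) is also an assumption and may differ between Zeppelin versions.

```python
import json
import urllib.request
from urllib.parse import urlunsplit


def zeppelin_api_url(host: str, port: int, path: str = "version", scheme: str = "http") -> str:
    """Build a URL under the Zeppelin REST API root (assumed to be /api)."""
    return urlunsplit((scheme, f"{host}:{port}", f"/api/{path.lstrip('/')}", "", ""))


def parse_version(payload: str) -> str:
    """Extract the version from a /api/version response body.

    Assumption: the server wraps results as {"status": ..., "body": ...},
    where "body" is either a version string or an object with a "version" key.
    """
    data = json.loads(payload)
    body = data.get("body", {})
    if isinstance(body, dict):
        return body.get("version", "unknown")
    return str(body)


def check_zeppelin(host: str, port: int, timeout: float = 5.0) -> str:
    """Return the reported Zeppelin version, or raise OSError if unreachable."""
    with urllib.request.urlopen(zeppelin_api_url(host, port), timeout=timeout) as resp:
        return parse_version(resp.read().decode("utf-8"))
```

Calling check_zeppelin("localhost", 8080) with your own host and port either returns the version string or raises an OSError (for example, connection refused), which tells you whether the problem lies with the server address rather than with the IDE settings.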

What’s planned?

First and foremost, we plan to go beyond the Zeppelin and Spark integration, which, of course, we’re going to keep improving. We plan to bring in more integrations specific to Big Data, including integration with distributed file systems such as HDFS and S3, a built-in viewer for Parquet files, and better support for SQL.

Here’s a possible timeline for every feature on our roadmap:

  • November 2019: Compatibility with IntelliJ IDEA Ultimate 2019.3 EAP
  • November 2019: Integration with S3 (file explorer)
  • December 2019: Basic coding assistance for SQL (highlighting, completion, auto-formatting, etc.)
  • January 2020: Integration with HDFS (file explorer) and the Parquet viewer
  • February 2020: Basic integration with Hadoop and Spark (YARN, Spark UI, etc.)
  • Spring / Summer 2020: Support for Python in Zeppelin notebooks (making the plugin compatible with PyCharm Professional Edition is also being considered and will be confirmed later)

The roadmap is not set in stone. We’re eager to adjust it according to your feedback to make sure our focus is on the right things. Please reach out to us and share your feedback – be it your opinion on how a feature should work, missing features, or annoying bugs.

What else is important to know about the new plugin?

The Big Data Tools plugin is currently compatible only with IntelliJ IDEA Ultimate, which means it is not available for the Community Edition. We don’t know yet whether the plugin will make it to the Community Edition. For now, we have no such plans.

The same goes for the other IDEs. We don’t know yet whether the plugin will also be made compatible with them. For now, we only have a preliminary plan to extend compatibility to PyCharm Professional Edition once the plugin gets support for Python. As the plugin matures, we may get a better understanding of how to move forward with regard to the other IDEs.

I don’t have a license for IntelliJ IDEA Ultimate. How can I try the plugin?

  • If you’re an active committer to any open-source project related to Big Data, feel free to reach out to us.
  • If you’re writing about Big Data Tools, please get in touch with us and we will provide you with an extended trial so you can try out the plugin.
  • If you are neither a committer nor a blogger but have a keen interest in big data, please also reach out to us for an extended trial (e.g. up to 3 months) – this offer is limited.

Feedback

We’re eager to hear your feedback about the plugin. First and foremost, questions and comments are very welcome in the comments section of this blog post. If you find a bug or would like to suggest a feature, please submit it to our issue tracker.

If you’ve tried the plugin and would like to report bugs, request features, or share your overall feedback, please fill out this 1-minute survey.

Your feedback is crucial to us: we’d like to make your developer experience with Spark, Zeppelin, and Big Data as enjoyable and straightforward as possible. Please help us do so by sharing your constructive feedback.

Slack community

We’ve set up a Slack workspace to facilitate collaboration and feedback sharing. Join in to share your experience with the plugin development team and other users.

P.S. We’re especially glad that this announcement happens to coincide with our team attending the Spark AI Summit. If you are by any chance at this conference, make sure to visit our booth and say hi to the team, see the plugin in action, and, of course, share your feedback.

The Drive to Develop!
The JetBrains Team

12 Responses to Meet Big Data Tools – Spark Integration and Zeppelin Notebooks in IntelliJ IDEA

  1. Gevorg says:

    October 17, 2019

    When I try to connect to the Zeppelin server, I get this error message:
    Zeppelin on is unreachable. Please make sure that you have entered address and port correctly.
    Current Zeppelin server is OK

    • Andrey Cheptsov says:

      October 17, 2019

      Hello, which version of Zeppelin are you using? Is it 0.7.x or 0.8.x?

    • Mikhail says:

      October 17, 2019

      I have the same issue, using Zeppelin 0.7.3 (deployed through EMR).

  2. Lewinma says:

    October 24, 2019

    Awesome work!
    May I also ask whether I can see the source code? I need Flink support too.

  3. Aleck says:

    October 31, 2019

    Works perfectly! A great tool. Thank you guys. Please continue doing the Big Data tools and consider going into Tableau/PowerBI area as well, $$$.

  4. raveena says:

    February 14, 2020

    Thanks for this good information
    Spark and Scala Online Training

  5. pokerdragon says:

    May 16, 2020

    I’m using IntelliJ IDEA 2020.1.1 CE. I downloaded the plugin zip (0.2.587) and installed it from disk, but the plugin says it requires the “com.intellij.modules.ssh” plugin to be installed. Where can I find this plugin? Thanks!

    • pokerdragon says:

      May 16, 2020

      Sorry, I missed the part saying that this plugin is for the Ultimate version only…

      • Andrey Cheptsov says:

        May 16, 2020

        Yes, just wanted to make this comment. Please try Ultimate!