Meet Big Data Tools – Spark Integration and Zeppelin Notebooks in IntelliJ IDEA
Hooray! Today we have some exciting news for you. After all, it is not often that JetBrains introduces new developer tools. Now that you’re excited as well, we’re very pleased to announce Big Data Tools, a new IntelliJ IDEA plugin that integrates Spark and brings support for editing and running Zeppelin notebooks. With the plugin, you can create, edit, and run Zeppelin notebooks without ever having to leave your favorite IDE. It offers smart navigation, code completion, inspections and quick-fixes, and refactoring inside notebooks.
So, what precisely does the plugin support now?
- Browsing, creating, and deleting notebooks
- Coding assistance for Scala paragraphs
  - Code completion
  - Rename refactoring
  - Extract variable, method or parameter
  - Go to declaration
  - Show usages
- Creating and deleting paragraphs
- Running paragraphs
- Browsing paragraphs’ output
- Support for basic visualization
Note that coding assistance is currently limited to Scala code. Other languages will come later. In the future, we also plan to go beyond Zeppelin notebooks and add more general features that will streamline the developer experience for data engineers, data scientists, and other professionals who work with Big Data.
If you prefer to see it in action rather than just read about it, make sure to watch this short video:
Otherwise, go ahead and try it out for yourself! Below is a short introduction to getting started.
How do I use the plugin?
- Make sure you’re using IntelliJ IDEA Ultimate 2019.2 (note that the plugin doesn’t work with 2019.3 EAP yet).
- Make sure you’ve installed the BashSupport, Python, and Scala plugins (the latest releases should do). For now, you have to install those manually – this will be improved soon.
- Install the latest build of the Big Data Tools plugin.
- If everything has worked as expected, after an IDE restart you’ll see the Big Data Tools tool window on the right-hand side. Open it, click the ‘+’ icon on the tool window toolbar, and choose Zeppelin.
- In the Zeppelin connection dialog, fill in the connection parameters for your Zeppelin instance (such as host, port, credentials, etc.). Use Test Connection to verify that the IDE can reach the instance.
- Once the Zeppelin configuration is set up, you’ll see the instance in the tool window along with the notebooks tree. Feel free to manage the notebooks or open them in the editor.
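If Test Connection fails, it can help to check the host and port outside the IDE first. Below is a minimal sketch of such a check in Python. It is not part of the plugin and makes several assumptions: that your Zeppelin instance exposes its REST notebook list at `/api/notebook`, that it accepts unauthenticated requests, and that each entry carries a `name` (or, in some versions, a `path`) field. The function names and the `localhost:8080` address in the note below are made up for illustration.

```python
import json
from urllib.parse import urlunsplit
from urllib.request import urlopen


def notebook_list_url(host: str, port: int, use_https: bool = False) -> str:
    """Build the URL of the (assumed) Zeppelin REST endpoint that lists notebooks."""
    scheme = "https" if use_https else "http"
    return urlunsplit((scheme, f"{host}:{port}", "/api/notebook", "", ""))


def notebook_names(response_text: str) -> list:
    """Extract notebook names (or paths) from a notebook-list JSON response."""
    payload = json.loads(response_text)
    # Assumption: the list of notebooks lives under the "body" key.
    return [item.get("name") or item.get("path") for item in payload.get("body", [])]


def fetch_notebook_names(host: str, port: int, timeout: float = 5.0) -> list:
    """Query a running instance; raises an error if the host or port is unreachable."""
    with urlopen(notebook_list_url(host, port), timeout=timeout) as response:
        return notebook_names(response.read().decode("utf-8"))
```

Calling `fetch_notebook_names("localhost", 8080)` against a running, unsecured instance should return the same notebooks you would expect to see in the tool window. An instance that requires credentials will reject this request, so treat the sketch only as a quick reachability check.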
First and foremost, we plan to go beyond the integration of Zeppelin and Spark, which, of course, we’re going to keep improving. We plan to bring in more integrations specific to Big Data. These include integration with distributed file systems such as HDFS and S3, a built-in viewer for Parquet files, and better support for SQL.
Here’s a tentative timeline for the features on our roadmap:
- November 2019: Compatibility with IntelliJ IDEA Ultimate 2019.3 EAP
- November 2019: Integration with S3 (file explorer)
- December 2019: Basic coding assistance for SQL (highlighting, completion, auto-formatting, etc.)
- January 2020: Integration with HDFS (file explorer) and the Parquet viewer
- February 2020: Basic integration with Hadoop and Spark (YARN, Spark UI, etc.)
- Spring / Summer 2020: Support for Python in Zeppelin notebooks (making the plugin compatible with PyCharm Professional Edition is also being considered and will be confirmed later)
The roadmap is not set in stone. We’re eager to adjust it according to your feedback to make sure our focus is on the right things. Please reach out to us and share your feedback – be it your opinion on how a feature should work, missing features, or any annoying bugs.
What is also important to know about the new plugin?
The Big Data Tools plugin is currently compatible only with IntelliJ IDEA Ultimate. This means the plugin is not available for the Community Edition. We don’t know yet if the plugin will make it to the Community Edition. For now, we have no such plans.
The same goes for the other IDEs. We don’t know yet if the plugin will also be made compatible with them. For now, we only have a preliminary plan to extend compatibility to PyCharm Professional Edition once the plugin gets support for Python. As the plugin matures, we may have a better understanding of how we can move forward with regard to the other IDEs.
I don’t have a license for IntelliJ IDEA Ultimate. How can I try the plugin?
- If you’re an active committer to any open-source project related to Big Data, feel free to reach out to us.
- If you’re writing about Big Data Tools, please get in touch with us and we will provide you with an extended trial so you can try out the plugin.
- If you are neither a committer nor a blogger but have a keen interest in big data, please also reach out to us for an extended trial (e.g. up to 3 months) – this offer is limited.
We’re eager to hear your feedback about the plugin. First and foremost, questions and comments are very welcome here, in the comments to this blog post. If you find a bug or would like to suggest a feature, please submit it to our issue tracker.
If you’ve tried the plugin and would like to report bugs, request features, or share your overall feedback, please fill out this 1-minute survey.
Your feedback is crucial to us. We’d like to make your developer experience with Spark, Zeppelin, and Big Data as enjoyable and straightforward as possible, so please help us do so by sharing your constructive feedback.
We’ve set up a Slack workspace to facilitate collaboration and feedback sharing. Join in to share your experience with the plugin development team and other users.
P.S. We’re especially glad that this announcement happens to coincide with our team attending the Spark AI Summit. If you are at the conference, make sure to visit our booth to say hi to the team, see the plugin in action, and, of course, share your feedback.
The Drive to Develop!
The JetBrains Team