Using BigQuery from IntelliJ-based IDE

Continuing the series of posts on how to connect DataGrip (or any other IntelliJ-based IDE) to various data sources, in this post we’ll show you how to connect to Google’s BigQuery. BigQuery is a low-cost enterprise data warehouse designed to handle data analytics at a massive scale.

Currently, DataGrip does not come with a built-in connector for BigQuery, so let’s configure the driver and the connection manually.

Update: if you prefer to watch a video instead of reading this post, please see a recent version of this tutorial created in April 2019.

First, let’s get all the details needed to connect to BigQuery from the Cloud Console. Make sure you have the right project selected, then click on the Create service account button. Type the name of the account, for example “datagrip”. In the Role section, select the BigQuery | BigQuery Data Viewer, BigQuery | BigQuery Job User, and BigQuery | BigQuery User roles, then check the Furnish a new private key box. Make sure that the Key Type is set to JSON, then click on the Create button. A credentials file will be downloaded, and you can then click on the Close button in the dialog.

Note: Store that file in a safe place, because anyone who can read it can connect to the BigQuery datasets that the service account is allowed to access.
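One way to limit who can read the key file is to restrict its permissions to the owning user. The snippet below is a minimal sketch in Python, assuming a POSIX system (on Windows, `os.chmod` only toggles the read-only flag); the function name `restrict_key_file` is hypothetical and only used for illustration.

```python
import os
import stat


def restrict_key_file(path: str) -> int:
    """Limit the key file to owner read/write and return the resulting mode bits."""
    # S_IRUSR | S_IWUSR is the same as mode 0o600: no access for group or others.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling `restrict_key_file` with the path of the downloaded JSON file leaves the file readable only by your own user account.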

DataGrip - BigQuery - Google Cloud Console

Now, back in the IDE, we need to add a new Data Source. Go to the Database tool window, select Data Source Properties, and click on the “+” button. At the bottom of the list you’ll see an entry named Driver; click on it.

Start by changing the name of the configuration to “BigQuery”. You will notice that there is a section called Driver files. You can download the BigQuery JDBC driver from this page. Download the “JDBC 4.2-compatible” zip file, named “SimbaJDBCDriverforGoogleBigQuery42_1.1.6.1006.zip” at the time of writing, then unpack it into a new directory.

Once you have downloaded and unpacked the driver, go back to the IDE, and in the Additional files section, click on “+” and add all the “.jar” files from the newly created directory. In the Class drop-down, you can now select “com.simba.googlebigquery.jdbc42.Driver”. In the Dialect drop-down, select Generic SQL.

Next, let’s define the connection URL template. Click on the “+” button under the URL templates section and add a new entry named “default” with the value: “jdbc:bigquery://[{host::https\://www.googleapis.com/bigquery/v2}[:{port::443}]][<;,{:identifier}={:param}>]”. Go to the Advanced tab, and in the OAuthType field, type “0” (zero), which selects service account authentication in this driver.
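To make the template easier to read, here is a rough sketch in Python of the URL shape it describes: the default host and port, followed by optional `;key=value` properties. The function name `build_bigquery_jdbc_url` is hypothetical, and this only illustrates the string format; it is not how the IDE itself processes the template.

```python
def build_bigquery_jdbc_url(properties, host="https://www.googleapis.com/bigquery/v2", port=443):
    """Return a URL with the host, the port, and then ;key=value pairs."""
    url = f"jdbc:bigquery://{host}:{port}"
    # Each driver property is appended as ";Name=Value", in insertion order.
    for key, value in properties.items():
        url += f";{key}={value}"
    return url


example_url = build_bigquery_jdbc_url({"OAuthType": 0})
print(example_url)
```

Further driver properties would be appended in the same way if you chose to write them inline rather than in the Advanced tab.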

DataGrip - BigQuery - Driver Config

With the driver configured, go to the “+” sign at the top left of the Data Sources and Drivers window and select “BigQuery”. Leave the User and Password fields empty.
If you don’t want to modify the data in BigQuery, or your user can only view the data without modifying it, then mark the connection as Read-only.

Click on the Advanced tab and enter your service account email in the OAuthServiceAcctEmail field. This is the email we have from the earlier step in the Google Cloud Console. In the OAuthPvtKeyPath field, type the path to the file downloaded earlier from the Cloud Console. Finally, type the project ID in the ProjectId field.
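If you are unsure which values to enter, the downloaded JSON key file already contains the service account email (`client_email`) and a project ID (`project_id`). The sketch below, in Python, reads them and pairs them with the property names used in the Advanced tab. The function name `driver_properties_from_key` is hypothetical, and it assumes the project recorded in the key file is the one you want to query, which may not hold if the service account was created in a different project.

```python
import json


def driver_properties_from_key(key_path: str) -> dict:
    """Read a service account JSON key and return the values for the Advanced tab."""
    with open(key_path, encoding="utf-8") as handle:
        key = json.load(handle)
    return {
        # strip() guards against stray whitespace around the copied values.
        "OAuthServiceAcctEmail": key["client_email"].strip(),
        "OAuthPvtKeyPath": key_path,
        # Assumption: the key's project is the project you intend to query.
        "ProjectId": key["project_id"].strip(),
    }
```

Printing the returned dictionary gives you the three values to copy into the corresponding fields.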

Go back to the General tab, click on the Test Connection button and you should see a “Successful” message. This means you configured everything correctly and you can now connect to BigQuery.

Close the window by pressing the OK button, and now you’ll be presented with a database console ready for use.

DataGrip - BigQuery - Connection Config

Let’s run a few sample queries to see how everything works. For this, we’ll use one of the BigQuery public datasets, and borrow one query from the official documentation:

Press CTRL+ENTER and the query will start running.

Note: Every time you run a query against BigQuery that requires it to process data, you will pay for it. Make sure that you understand the size of the data and the query you are about to run before doing so. For more details about pricing, please see the official BigQuery documentation or contact your Google Cloud administrators.
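As a back-of-the-envelope illustration, the cost of a query billed by the amount of data processed can be approximated in Python as shown below. This is only a sketch: the function name `estimate_query_cost` is hypothetical, the price per TiB is something you must look up in the current pricing documentation, and billing minimums, rounding, and free tiers are not modeled.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_query_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Approximate the cost of a query from the bytes it processes.

    price_per_tib is an input you supply from the current pricing page;
    no real price is assumed here.
    """
    return bytes_processed / TIB * price_per_tib
```

For example, with a purely hypothetical price of 5.0 per TiB, a query that scans half a TiB would be estimated at 2.5 in the same currency.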

DataGrip - BigQuery - Run Query

And that’s it. Now you can use BigQuery just like any other database and have all the IDE power for completion, data export, and visualization. Please let us know in the comments below if you have any questions or feature requests. We look forward to your feedback.

About Florin Pățan

Developer Advocate at JetBrains

27 Responses to Using BigQuery from IntelliJ-based IDE

  1. Ilya Onskul says:

    Hi there.

    I managed to set up connection as specified, but the completion doesn’t seem to be aware of the table structure. Is this a problem with my setup or something that needs to be added to feature list?

    Thanks

  2. Bob De Schutter says:

    Hi

    When I try to change the values of the parameters in the datasource advanced tab (like OAuthServiceAcctEmail), I can only select ‘?’ from the dropdown as value and I am unable to enter a value myself…

    What am I doing wrong?

    Using datagrip 2017.2

    Thanks
    Bob

  3. Fritz Zuhl says:

    I have made a connection to bigquery, but for the non-trivial queries, I time-out. Any ideas on how I increase the timeout value?

  4. Fritz Zuhl says:

    Never mind. I figured it out. I changed the ‘timeout’ value in datagrip.

  5. Grant says:

    Do you think we can get native support? This documentation is out of date as Google now provides the driver in a DMG format

  6. Mārtiņš says:

    Thank you!
    You can also configure connection without service account and run queries as current user.
    Just put your email as OAuthServiceAcctEmail and location of your credentials, like /Users/[user]/.config/gcloud/application_default_credentials.json as OAuthPvtKeyPath.

  7. Kipson says:

    Doesn’t work for me. Appears to be outdated. Seriously frustrating.

  8. SKod says:

    Did not work for me either. It says ClassNotFound at the Username & Password section.

  9. Thijs says:

    Outdated, better use some other client if they don’t even support Google.

  10. Olivier says:

    Hi there,
    I’ve followed the tutorial and looked at the video (the last link you sent) and I keep getting this error message:

    The specified database user/password combination is rejected: com.simba.googlebigquery.support.exceptions.GeneralException: EXEC_JOB_EXECUTION_ERR

    • Florin Pățan says:

      Please make sure that you don’t have any whitespace around your OAuthServiceAcctEmail, the path to the file, or the database name. If this is still a problem, please let me know.

      • chris says:

        Hey Florin, I’ve had the same issue, checked for whitespace etc. and all fine but still getting this error.

        • Florin Pățan says:

          Without having more information on this, including the actual configuration, it’s hard to investigate why it happens.
          Have you tried to look at the video and see the steps there? I recorded the video without stopping so everything in it should still apply/work as described.
          On the other hand, you can also open an issue on our issue tracker at https://youtrack.jetbrains.com/issues/DBE, mark it as “jetbrains-team” only, and share your configuration with us? Please CC me in the issue and maybe me or my colleagues will notice anything in the configuration that’s not as it should be.

    • Bjorn says:

      Check the permissions of the Service Account that you are using. I only added Viewer permissions, because that should be enough in my opinion, but I got the same error message as you. I revisited that part and added all the permissions as mentioned in this article and now it works.

  11. Jonathan Lange says:

    Hi,

    Thanks for writing this blog post, it’s very helpful.

    It looks like you’re suggesting that everyone who needs to use DataGrip to access BigQuery should share access to the same service account using the same shared credentials. That won’t fly with our infrastructure team.

    Is there another way? Should we have one service account per user?

    Thanks,
    Jonathan

    • Florin Pățan says:

      Hi Jonathan,

      I haven’t thought about such a scenario until now but I believe your question holds the answer. In that particular case, I would say that each user has their own Service Account.

      The other option would be for users to supply the credentials of their own account instead, but I haven’t tried that so far.

      From the management of access stand-point, I would personally go for the Service Account as that could be revoked, managed individually, and at a better granularity.

      I hope this helps you.

      Kind regards,
      Florin

  12. fpopic says:

    What about adding Google BigQuery Standard SQL dialect for file formatting?
    Generic dialect doesn’t support many of special BigQuery use cases.
