Using BigQuery from IntelliJ-based IDE

Posted on by Florin Pățan

Continuing the series of posts on how to connect DataGrip (or any other IntelliJ-based IDE) to various data sources, in this post we’ll show you how to connect to Google’s BigQuery. BigQuery is a low-cost enterprise data warehouse designed to handle data analytics at a massive scale.

Currently, DataGrip does not come with a built-in connector for BigQuery, so let’s connect the IDE to it.

Update: if you prefer to watch a video instead of reading this post, please see a recent version of this tutorial created in April 2019.

https://youtu.be/r9l2c_aQPoQ

First, let’s get all the details needed to connect to BigQuery from the Google Cloud Console. Make sure you have the right project selected, then click the Create service account button. Type a name for the account, for example “datagrip”. In the Role section, select the BigQuery | BigQuery Data Viewer, BigQuery | BigQuery Job User, and BigQuery | BigQuery User roles, then check the Furnish a new private key box. Make sure the Key Type is set to JSON and click the Create button. A credentials file will be downloaded; you can then click the Close button in the dialog.

Note: Keep that file in a safe place. Anyone who can read it will be able to connect to every BigQuery dataset the service account has access to.
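The downloaded file is a standard Google service-account JSON key. As a quick sanity check before wiring it into the IDE, you can verify it contains the fields the driver relies on (a minimal sketch; the helper name is ours, the field names are the standard service-account key fields):

```python
import json

# Fields present in every Google service-account JSON key; the JDBC
# driver reads client_email and private_key from this file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def check_key_file(path):
    """Return the parsed key if it looks like a service-account key, else raise."""
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"not a service-account key, missing: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']}")
    return key
```

The `client_email` value here is the same service-account email you will later paste into the driver’s OAuthServiceAcctEmail field.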

DataGrip - BigQuery - Google Cloud Console

Now, back in the IDE, we need to add a new Data Source. Open the Database tool window and select Data Source Properties, then click the “+” button. At the bottom of the list you’ll see an entry named Driver; click on it.

Start by changing the name of the configuration to “BigQuery”. You will notice a section called Driver files. You can download the BigQuery JDBC driver from this page. Download the “JDBC 4.2-compatible” zip file, named “SimbaJDBCDriverforGoogleBigQuery42_1.1.6.1006.zip” at the time of writing, then unpack it into a new directory.

Once you’ve downloaded the driver, go back to the IDE and, in the Additional files section, click “+” and add all the “.jar” files from the newly created directory. In the Class drop-down, you can now select “com.simba.googlebigquery.jdbc42.Driver”. In the Dialect drop-down, select Generic SQL.

Next, let’s define the connection URL template. Click the “+” button under the URL templates section and add a new entry named “default” with the value: “jdbc:bigquery://[{host::https\://www.googleapis.com/bigquery/v2}[:{port::443}]][<;,{:identifier}={:param}>]”. Then go to the Advanced tab and, in the OAuthType field, type “0” (zero).
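To unpack that template: `{host::…}` and `{port::443}` are placeholders with defaults, and the `<;,{:identifier}={:param}>` part means any extra driver properties are appended as `;Key=Value` pairs. Here is an illustrative sketch of what the final JDBC URL ends up looking like (not DataGrip’s actual implementation, and the email and project ID below are hypothetical):

```python
# Illustrative expansion of the URL template above -- a sketch,
# not DataGrip's real template engine.

DEFAULT_HOST = "https://www.googleapis.com/bigquery/v2"
DEFAULT_PORT = 443

def bigquery_jdbc_url(host=DEFAULT_HOST, port=DEFAULT_PORT, **props):
    """Build a Simba BigQuery JDBC URL; extra driver properties are
    appended as ;Key=Value pairs, matching <;,{:identifier}={:param}>."""
    url = f"jdbc:bigquery://{host}:{port}"
    for key, value in props.items():
        url += f";{key}={value}"
    return url

# With the properties set on the Advanced tab (hypothetical values):
url = bigquery_jdbc_url(
    OAuthType=0,
    OAuthServiceAcctEmail="datagrip@my-project.iam.gserviceaccount.com",
    OAuthPvtKeyPath="/path/to/key.json",
    ProjectId="my-project",
)
```

In practice you won’t type this URL by hand; the template plus the Advanced-tab fields produce it for you.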

DataGrip - BigQuery - Driver Config

With the driver configured, click the “+” sign at the top left of the Data Sources and Drivers window and select “BigQuery”. Leave the User and Password fields empty.
If you don’t want to modify the data in BigQuery, or your user can only view the data without modifying it, then mark the connection as Read-only.

Click on the Advanced tab and set OAuthServiceAcctEmail to your service account’s email, the one from the earlier step in the Google Cloud Console. In OAuthPvtKeyPath, type the path to the file downloaded earlier from the Cloud Console. Finally, type the project ID in the ProjectId field.

Go back to the General tab and click the Test Connection button; you should see a “Successful” message. This means everything is configured correctly and you can now connect to BigQuery.

Close the window by pressing the OK button, and now you’ll be presented with a database console ready for use.

DataGrip - BigQuery - Connection Config

Let’s run a few sample queries to see how everything works. For this, we’ll use one of the available BigQuery public datasets and borrow a query from the official documentation:

SELECT corpus FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus;

Press Ctrl+Enter and the query will start running.

Note: Every time you run a query that requires BigQuery to process data, you will pay for it. Make sure you understand the size of the data and the query you are about to run before doing so. For more details about pricing, please see the official BigQuery documentation or contact your Google Cloud administrators.

DataGrip - BigQuery - Run Query

And that’s it. Now you can use BigQuery just like any other database, with all the power of the IDE for completion, data export, and visualization. Please let us know in the comments below if you have any questions or feature requests; we look forward to your feedback.


29 Responses to Using BigQuery from IntelliJ-based IDE

  1. Ilya Onskul says:

    August 2, 2018

    Hi there.

    I managed to set up connection as specified, but the completion doesn’t seem to be aware of the table structure. Is this a problem with my setup or something that needs to be added to feature list?

    Thanks

    • Maksim Sobolevskiy says:

      August 3, 2018

      Is current schema selected in the drop-down of the console? A screenshot will be useful 🙂

      • Rob says:

        May 14, 2019

        I have the same issue. Hopefully the image below shows the same issue as Ilya.

        https://imgur.com/8mv8G9B

        • Linden says:

          June 15, 2020

          I also have this issue, are there any updates on how to fix? It makes using Datagrip as a complete IDE solution for BQ not workable, since you need to switch back to the UI or some other tool to view the schema all the time.

          • Linden says:

            June 15, 2020

            Playing around I may have found a solution to this. Right clicking on the database name in the schema window and going to Database Tools > Manage Shown Schemas brings up a dialogue box showing the database schema. Highlighting the desired dataset and clicking the icon in the top righthand corner to “Mark as current”, then hitting enter, caused my schema to appear. After some time refreshing all the tables appeared, although the column schema within each table is not shown.

  2. Bob De Schutter says:

    August 23, 2018

    Hi

    When I try to change the values of the parameters in the datasource advanced tab (like OAuthServiceAcctEmail), I can only select ‘?’ from the dropdown as value and I am unable to enter a value myself…

    What am I doing wrong?

    Using datagrip 2017.2

    Thanks
    Bob

    • Maksim Sobolevskiy says:

      August 23, 2018

      What driver do you use?

  3. Fritz Zuhl says:

    September 12, 2018

    I have made a connection to bigquery, but for the non-trivial queries, I time-out. Any ideas on how I increase the timeout value?

  4. Fritz Zuhl says:

    September 12, 2018

    Never mind. I figured it out. I changed the ‘timeout’ value in datagrip.

  5. Grant says:

    October 22, 2018

    Do you think we can get native support? This documentation is out of date as Google now provides the driver in a DMG format

  6. Mārtiņš says:

    October 25, 2018

    Thank you!
    You can also configure connection without service account and run queries as current user.
    Just put your email as OAuthServiceAcctEmail and location of your credentials, like /Users/[user]/.config/gcloud/application_default_credentials.json as OAuthPvtKeyPath.

  7. Kipson says:

    February 26, 2019

    Doesn’t work for me. Appears to be outdated. Seriously frustrating.

    • Florin Pățan says:

      May 15, 2019

      Please watch the video linked above in the article, https://www.youtube.com/watch?v=r9l2c_aQPoQ The UI may have changed a bit but the client still works. If you still have problems, please let us know.

  8. SKod says:

    March 24, 2019

    Did not work for me either. It says ClassNotFound at the Username & Password section.

    • Florin Pățan says:

      May 15, 2019

      Please watch the video linked above in the article, https://www.youtube.com/watch?v=r9l2c_aQPoQ The UI may have changed a bit but the client still works. If you still have problems, please let us know.

  9. Thijs says:

    April 23, 2019

    Outdated, better use some other client if they don’t even support Google.

    • Florin Pățan says:

      May 15, 2019

      Please watch the video linked above in the article, https://www.youtube.com/watch?v=r9l2c_aQPoQ The UI may have changed a bit but the client still works. If you still have problems, please let us know.

  10. Olivier says:

    July 5, 2019

    Hi there,
    I’ve followed the tutorial and looked at the video (the last link you sent) and I keep getting this error message:

    The specified database user/password combination is rejected: com.simba.googlebigquery.support.exceptions.GeneralException: EXEC_JOB_EXECUTION_ERR

    • Florin Pățan says:

      July 11, 2019

Please make sure that you don’t have any whitespace around your OAuthServiceAcctEmail or the paths to the file or the database name. If this is still a problem, please let me know.

      • chris says:

        August 1, 2019

        Hey Florin, I’ve had the same issue, checked for whitespace etc. and all fine but still getting this error.

        • Florin Pățan says:

          August 1, 2019

          Without having more information on this, including the actual configuration, it’s hard to investigate why it happens.
          Have you tried to look at the video and see the steps there? I recorded the video without stopping so everything in it should still apply/work as described.
          On the other hand, you can also open an issue on our issue tracker at https://youtrack.jetbrains.com/issues/DBE, mark it as “jetbrains-team” only, and share your configuration with us? Please CC me in the issue and maybe me or my colleagues will notice anything in the configuration that’s not as it should be.

          • Olivier says:

            December 4, 2019

            I now get the following error:

            “Connection to BigQuery failed.
            [HY000][100004] [Simba][BigQueryJDBCDriver](100004) HttpTransport IO error : sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target.”

            I have then downloaded the root certificate for https://www.googleapis.com/bigquery/v2:443 and added it to the cacerts file as it’s the general fix for this type of error (https://stackoverflow.com/questions/21076179/pkix-path-building-failed-and-unable-to-find-valid-certification-path-to-requ) but I still get the error message.

            I’m really stuck here and not being able to connect DataGrip to BigQuery is a blocker for me to purchase the licence.

            Can you please advise?

            Thanks,
            Olivier

    • Bjorn says:

      August 21, 2019

      Check the permissions of the Service Account that you are using. I only added Viewer permissions, because that should be enough in my opinion, but I got the same error message as you. I revisited that part and added all the permissions as mentioned in this article and now it works.

      • Olivier says:

        December 4, 2019

        Yeah I’ve given all the access levels I could to that service account in GCP’s IAM.

  11. Jonathan Lange says:

    October 2, 2019

    Hi,

    Thanks for writing this blog post, it’s very helpful.

    It looks like you’re suggesting that everyone who needs to use DataGrip to access BigQuery should share access to the same service account using the same shared credentials. That won’t fly with our infrastructure team.

    Is there another way? Should we have one service account per user?

    Thanks,
    Jonathan

    • Florin Pățan says:

      October 2, 2019

      Hi Jonathan,

      I haven’t thought about such a scenario until now but I believe your question holds the answer. In that particular case, I would say that each user has their own Service Account.

The other option would be to have each user put their own account’s credentials in those files instead, but I haven’t tried that so far.

From an access-management standpoint, I would personally go for the Service Account, as it can be revoked and managed individually, with better granularity.

      I hope this helps you.

      Kind regards,
      Florin

  12. fpopic says:

    October 27, 2019

    What about adding Google BigQuery Standard SQL dialect for file formatting?
    Generic dialect doesn’t support many of special BigQuery use cases.