How to Read From MySQL Database to pandas With Datalore
Reading data from a MySQL database to a pandas dataframe can be intimidating. Establishing a connection, keeping the credentials safe, creating an SQL query within a string variable, and saving the result to pandas is not a trivial task.
Luckily, you can automate this workflow with Datalore, a collaborative data science platform from JetBrains that offers a variety of data integrations, including MySQL.
Read on to learn how you can seamlessly transition from SQL to Python when reading data from a MySQL database.
Connect a MySQL database to Datalore
To connect a new MySQL database, open a new notebook and open Attached data.
Create a new database connection and specify your MySQL database host, port, database name, username, and password. Then, you can test and save the connection.
The cool thing about database connections in Datalore is that your credentials are not stored in the notebook environment, as they are handled by a separate database connection server. This means that when you collaborate with others on notebooks, they won’t be able to access your credentials.
Database connections and SQL cells are available for free for Community users with a 30-day trial. Professional and Enterprise users get unlimited access to these features.
The database you want to connect to might be hidden behind your company’s firewall.
To connect to it, there are 3 options:
- For Datalore Community and Professional accounts, you can whitelist Datalore’s IP address: “126.96.36.199”.
- Connect using SSH tunneling.
- Ask your company to install Datalore Enterprise in a private cloud or on-premises. This way, Datalore will be inside the firewall and the data won’t leave your company’s environment.
Learn more about Datalore Enterprise
Browse MySQL database schema
After testing a database connection, you can choose the schemas you want to work with. This could be the 1 the or 2 tables that you need most in your work.
After saving the connection, you will see it in Datalore’s Attached data tab. In Datalore, you can create as many database connections as necessary and query the data from several databases in 1 notebook.
It will take a few seconds for Datalore to introspect the database tables. Introspections will further enable correct code completion for column and table names. Then, you can click on the database item and explore the schema.
Create an SQL query and read query result to pandas
To create an SQL query, click New SQL cell on the database item. Then, you can start typing SQL code. Datalore will offer code completion depending on the SQL dialect and the introspection results.
The result of the query is automatically saved to a pandas dataframe, where you can continue seamlessly working on the query results with Python. In this query, we read 5,000,000 rows into the dataframe.
Datalore also brings analytical tools such as the Statistics and Visualize tabs, which allow you to quickly explore the data without writing a single line of code.
- Ad hoc reporting. With Datalore, you can make ad hoc reports from MySQL database results. With the Report builder, you can arrange the notebook cells on a canvas and share results with stakeholders as a Static or Interactive Report.
- SQL query parameterization. You can parameterize SQL queries with String, List, Integer, Float, and Boolean Python variables. Just specify your parameters in curly brackets and Datalore will run a parameterized query for you!
- Sharing database connections with your team. In Datalore, you can create database connections both in your private Home workspace and in Shared team workspaces.
By creating a connection once, workspace members will be able to use it in workspace notebooks. This can facilitate onboarding new members and help you avoid writing hundreds of lines of boilerplate code.
We hope that the database integrations in Datalore will make reading data from MySQL databases to pandas dataframes easier and allow you to focus on data exploration.
Apart from MySQL databases, Datalore supports PostgreSQL, Amazon Redshift, Snowflake, BigQuery, MariaDB, and more.
If you want to use the traditional way to query databases purely with Python, read this article.
The Datalore team
Subscribe to Blog updates
Thanks, we've got you!