Importing data from a MySQL database into a Pandas data frame including column names
IMO, it would be much more efficient to let pandas read the data from your MySQL server directly:
from sqlalchemy import create_engine
import pandas as pd
db_connection_str = 'mysql+pymysql://mysql_user:mysql_password@mysql_host/mysql_db'
db_connection = create_engine(db_connection_str)
df = pd.read_sql('SELECT * FROM table_name', con=db_connection)
This should also take care of the column names.
Author by
vFlav
Updated on July 08, 2022
Comments
-
vFlav almost 2 years
I am importing data from a MySQL database into a Pandas data frame. The following excerpt is the code that I am using:
import mysql.connector as sql
import pandas as pd
db_connection = sql.connect(host='hostname', database='db_name', user='username', password='password')
db_cursor = db_connection.cursor()
db_cursor.execute('SELECT * FROM table_name')
table_rows = db_cursor.fetchall()
df = pd.DataFrame(table_rows)
When I print the data frame, it represents the data properly, but is it possible to also keep the column names? Here is an example output:
                          0    1    2     3     4     5     6     7     8
0  :ID[giA0CqQcx+(9kbuSKV==  NaN  NaN  None  None  None  None  None  None
1  lXB+jIS)DN!CXmj>0(P8^]==  NaN  NaN  None  None  None  None  None  None
2  lXB+jIS)DN!CXmj>0(P8^]==  NaN  NaN  None  None  None  None  None  None
3  lXB+jIS)DN!CXmj>0(P8^]==  NaN  NaN  None  None  None  None  None  None
4  lXB+jIS)DN!CXmj>0(P8^]==  NaN  NaN  None  None  None  None  None  None
What I would like to do is keep the column names, which would replace the pandas column indexes. For example, instead of 0, the first column would be named "First_column", as in the MySQL table. Is there a good way to go about this? Or is there a more efficient approach to importing data from MySQL into a pandas data frame than mine?
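One way to keep the column names with the cursor-based code above is to read them from the cursor itself. The sketch below is only an illustration, not the asker's or any answerer's code: it relies on the standard DB-API attribute cursor.description, which MySQL drivers such as mysql.connector and MySQLdb provide, and it uses an in-memory SQLite database as a stand-in so that it runs without a MySQL server. The helper names fetch_rows_with_columns and to_dataframe and the sample table contents are hypothetical.

```python
import sqlite3


def fetch_rows_with_columns(connection, query):
    """Run `query` and return (column_names, rows) using only DB-API calls."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # DB-API (PEP 249): cursor.description has one entry per result
        # column, and the first item of each entry is the column name.
        column_names = [column[0] for column in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    return column_names, rows


def to_dataframe(column_names, rows):
    """Build a DataFrame labelled with the SQL column names."""
    import pandas as pd  # imported lazily so the helper above needs no pandas

    return pd.DataFrame(rows, columns=column_names)


# Assumption: SQLite stands in for MySQL here, with a made-up table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE table_name (First_column TEXT, Second_column REAL)")
connection.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("a", 1.5), ("b", None)],
)

column_names, rows = fetch_rows_with_columns(connection, "SELECT * FROM table_name")
connection.close()
```

With pandas installed, df = to_dataframe(column_names, rows) should give a frame whose columns are First_column and Second_column rather than 0 and 1. The same helper should accept the connection returned by sql.connect(...) in the question, since it touches nothing beyond cursor(), execute(), description and fetchall().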
-
MaxU - stop genocide of UA almost 8 years
Why don't you use pd.read_sql()?
-
kneewarp about 6 years
The question here is related to a MySQL db, and not SQLAlchemy, as asked in the duplicate. pd.read_sql() does not support a mysql connection. This question should not be marked as a duplicate. To answer the query: df = pd.DataFrame(table_rows, columns=db_cursor.column_names) will do what is asked.
-
kainC almost 6 years
@kneewarp you should post this as an answer. The accepted answer will not work with a MySQL connection, which the OP requested.
-
-
HaMi almost 5 years
In my case this worked, but I couldn't query the table directly anymore until I closed the connection:
db_connection.close()
-
chaikov over 4 years
According to stackoverflow.com/questions/42118750/…, I've decided to use MySQLdb instead. How can this be accomplished in MySQLdb?
-
Yogesh Awdhut Gadade over 4 years
One can also use mysql.connect to connect to the database (instead of importing two packages, sqlalchemy and pymysql) and then use the pd.read_sql function.
-
Aman Khandelwal over 4 years
db_connection.close() gives an error, and the MySQL server cannot be connected to.
-
Brad123 about 4 years
To close the connection:
db_connection.dispose()