Importing data from a MySQL database into a Pandas data frame including column names


IMO it would be much more efficient to use pandas for reading data from your MySQL server:

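from sqlalchemy import create_engine
import pandas as pd

# Connection URL: dialect+driver://user:password@host/database
# (the pymysql driver must be installed separately)
db_connection_str = 'mysql+pymysql://mysql_user:mysql_password@mysql_host/mysql_db'
db_connection = create_engine(db_connection_str)

# read_sql runs the query and returns the result as a DataFrame
df = pd.read_sql('SELECT * FROM table_name', con=db_connection)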

This also takes care of the column names: pd.read_sql labels the DataFrame columns with the names returned by the query.
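To see the column-name behaviour without a MySQL server, here is a minimal sketch that swaps in an in-memory SQLite database as a stand-in. The table name and the column names (First_column, Second_column) are made up for illustration; with MySQL you would pass the SQLAlchemy engine from above as con instead.

```python
import sqlite3

import pandas as pd

# Stand-in database: in-memory SQLite instead of MySQL (an assumption
# made only so the example runs anywhere).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (First_column TEXT, Second_column INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", [("a", 1), ("b", 2)])
conn.commit()

# read_sql takes the column labels from the query result, so no manual
# mapping of names is needed.
df = pd.read_sql("SELECT * FROM table_name", con=conn)
conn.close()

print(list(df.columns))
```

The printed list holds the table's column names rather than the integer labels 0, 1, 2, ... that pd.DataFrame(table_rows) produces.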

Author by

vFlav

Updated on July 08, 2022

Comments

  • vFlav
    vFlav almost 2 years

    I am importing data from a MySQL database into a Pandas data frame. The following excerpt is the code that I am using:

    import mysql.connector as sql
    import pandas as pd
    
    db_connection = sql.connect(host='hostname', database='db_name', user='username', password='password')
    db_cursor = db_connection.cursor()
    db_cursor.execute('SELECT * FROM table_name')
    
    table_rows = db_cursor.fetchall()
    
    df = pd.DataFrame(table_rows)
    

    When I print the data frame it does properly represent the data but my question is, is it possible to also keep the column names? Here is an example output:

                              0   1   2     3     4     5     6     7     8
    0  :ID[giA0CqQcx+(9kbuSKV== NaN NaN  None  None  None  None  None  None
    1  lXB+jIS)DN!CXmj>0(P8^]== NaN NaN  None  None  None  None  None  None   
    2  lXB+jIS)DN!CXmj>0(P8^]== NaN NaN  None  None  None  None  None  None   
    3  lXB+jIS)DN!CXmj>0(P8^]== NaN NaN  None  None  None  None  None  None   
    4  lXB+jIS)DN!CXmj>0(P8^]== NaN NaN  None  None  None  None  None  None   
    

    What I would like to do is keep the column names, so that they replace the default pandas column labels. For example, instead of 0, the first column would be named "First_column", as in the MySQL table. Is there a good way to do this? Or is there a more efficient way to import data from MySQL into a Pandas data frame than mine?

    • MaxU - stop genocide of UA
      MaxU - stop genocide of UA almost 8 years
      why don't you use pd.read_sql()?
    • kneewarp
      kneewarp about 6 years
      The question here is related to MySQL db - and not SQLalchemy - as asked in the duplicate. pd.read_sql() does not support mysql connection. This question should not be marked as a duplicate. To answer the query: df = pd.DataFrame(table_rows, columns=db_cursor.column_names) will do what is asked.
    • kainC
      kainC almost 6 years
      @kneewarp you should post this as an answer. The accepted answer will not work with a MySQL connection, which the OP requested.
  • HaMi
    HaMi almost 5 years
    In my case this worked, but couldn't query the table directly anymore until I closed the connection: db_connection.close()
  • chaikov
    chaikov over 4 years
    according to stackoverflow.com/questions/42118750/…. I've decided to use MySQLdb instead, how to accomplish this in MySQLdb?
  • Yogesh Awdhut Gadade
    Yogesh Awdhut Gadade over 4 years
    One can also use mysql.connect to connect the database (instead of importing two packages sqlalchemy & pymysql) and then can use pd.read_sql function
  • Aman Khandelwal
    Aman Khandelwal over 4 years
    db_connection.close() gives an error and the MySQL server cannot be connected to
  • Brad123
    Brad123 about 4 years
    to close connection: >>> db_connection.dispose()
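For anyone who wants to keep a plain cursor, as in the question, the column names can also be read from cursor.description, which the Python DB-API (PEP 249) defines for every compliant driver, so the same idea should carry over to mysql.connector and MySQLdb. The sketch below is a hedged illustration, not the method from any answer here: fetch_with_columns is a hypothetical helper, and sqlite3 stands in for MySQL only so the code runs without a server.

```python
import sqlite3


def fetch_with_columns(cursor, query):
    """Run `query` and return (column_names, rows).

    Hypothetical helper: relies only on the DB-API `description`
    attribute, whose first item per column is the column name.
    """
    cursor.execute(query)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    return columns, rows


# Stand-in database (assumption): in-memory SQLite instead of MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE table_name (First_column TEXT, Second_column INTEGER)")
cur.executemany("INSERT INTO table_name VALUES (?, ?)", [("a", 1), ("b", 2)])

columns, rows = fetch_with_columns(cur, "SELECT * FROM table_name")
conn.close()

print(columns)
```

The two return values can then be handed to pandas as pd.DataFrame(rows, columns=columns), which is the same pattern as the columns=db_cursor.column_names suggestion in the comments.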