How to fetch and print utf-8 data from mysql DB using Python?
Your problem is with your terminal (sys.stdout
) encoding (cf http://en.wikipedia.org/wiki/Code_page_862), which depends on your system's settings. The best solution (as explained here : https://stackoverflow.com/a/15740694/41316) is to explicitely encode your unicode data before printing them to sys.stdout
.
If you can't use a more usable encoding (utf-8 comes to mind, as it has been designed to handle all unicode characters), you can at least use an alternative error handling like "replace" (replaces non-encodable characters with '?') or "ignore" (suppress non-encodable characters).
Here's a corrected version of your code, you can play with the encoding
and on_error
settings to find out what solution works for you:
import sys
import MySQLdb
# set desired output encoding here
# it looks like your default encoding is "cp862"
# but you may want to first try 'utf-8' first
# encoding = "cp862"
encoding = "utf-8"
# what do when we can't encode to the desired output encoding
# options are:
# - 'strict' : raises a UnicodeEncodeError (default)
# - 'replace': replaces missing characters with '?'
# - 'ignore' : suppress missing characters
on_error = "replace"
db = MySQLdb.connect(
"localhost","matan","pass","youtube",
charset='utf8',
use_unicode=True
)
cursor = db.cursor()
sql = "SELECT * FROM VIDEOS"
try:
cursor.execute(sql)
for i, row in enumerate(cursor):
try:
# encode unicode data to the desired output encoding
title = row[0].encode(encoding, on_error)
link = row[1].encode(encoding, on_error)
except UnicodeEncodeError as e:
# only if on_error='strict'
print >> sys.stderr, "failed to encode row #%s - %s" % (i, e)
else:
print "title=%s\nlink=%s\n\n" % (title, link))
finally:
cursor.close()
db.close()
NB : you may also want to read this (specially the comments) http://drj11.wordpress.com/2007/05/14/python-how-is-sysstdoutencoding-chosen/ for more on Python, strings, unicode, encoding, sys.stdout
and terminal issues.
Admin
Updated on June 04, 2022Comments
-
Admin almost 2 years
I'm having problems with reading utf-8 data from a MySQL DB using Python. My DB contains one table named
Videos
, and the table contains at least one row that has Unicode chars, i.e.[KR] Samsung Galaxy Beam 2 간단 리뷰 [4K]
The collation of the table is
utf8_general_ci
, just like the collation of the fields in the table.This is the code I wrote in order to fetch all the data from my table:
# Open database connection db = MySQLdb.connect("localhost","matan","pass","youtube", charset = 'utf8',use_unicode=True) # prepare a cursor object using cursor() method cursor = db.cursor() # Prepare SQL query to INSERT a record into the database. sql = "SELECT * FROM VIDEOS" try: # Execute the SQL command cursor.execute(sql) # Fetch all the rows in a list of lists. results = cursor.fetchall() for row in results: title = row[0] link = row[1] # Now print fetched result print ("title=%s\nlink=%s\n\n" % \ (title, link)) except: print "Error: unable to fecth data" # disconnect from server db.close()
When I run the above code, it prints all the rows that contain only "ascii" chars, but when it gets to a row that contains Unicode char (i.e. the line I mentioned above), it prints :
File "C:\Users\Matan\Dropbox\Code\Python\youtube.py", line 28, in printall (title, link)) File "C:\Python27\lib\encodings\cp862.py", line 12, in encode return codecs.charmap_encode(input,errors,encoding_map) UnicodeEncodeError: 'charmap' codec can't encode characters in position 33-34: c haracter maps to <undefined>
and doesn't continue to the next row.
I'm using PhpMyAdmin version 4.1.14, MySQL version 5.6.17, and Python version 2.7.8 .
Edit: I dropped the except clause, and updated the error I've got.