Is a BLOB converted using the current/default charset in MySQL?

10,797

Short Answer:

Simply delete or comment out line below, and it will always work, no matter which database encoding is really in use (utf8, latin1, etc):

$pdo->exec('SET CHARACTER SET utf8');

Long Answer:

This is not PDO bug, this is MySQL bug.

When actual database encoding is latin1, but you use:

SET CHARACTER SET utf8

(or vice versa: actual is utf8, but you use latin1 - important part is that it is different), then, as far as I can tell, MySQL will try to perform charset conversion for all traffic between client and server (even for BLOB!).

If you DO NOT use SET CHARACTER SET statement, from what I see for scripts (PHP/PDO or Perl/DBI) connection charset by default is set to be the database charset, and in that case no implicit conversion takes place.

Obviously, this automatic conversion is what kills BLOBs, which do not want any conversion to happen.

I have tested this on both PHP/PDO and Perl/DBI, and issue is easily reproducible: both will fail if using database with latin1 encoding and using SET CHARACTER SET utf8 (or vice versa).

If you want to be fully UTF8 compatible, you should change encoding of your database using:

ALTER DATABASE mydb CHARSET utf8;

With this, everything will be using UTF8, and BLOBs will also work fine.

Minimal file that causes this corruption problem is blob.bin with single byte 0xFF. On Linux, you can create this test file using printf command:

printf "0xFF" > blob.bin

Now, test scripts that reproduce the problem:

PHP test code:

<?php
$dbh = new PDO("mysql:host=127.0.0.1;dbname=test");
# If database encoding is NOT utf8, uncomment to break it:
# $dbh->exec("SET CHARACTER SET utf8");

$blob1 = file_get_contents("blob.bin");
$sth = $dbh->prepare(
    "INSERT INTO pdo_blob (the_blob) VALUES(:the_blob)"
);
$sth->bindParam(":the_blob", $blob1, PDO::PARAM_LOB);
$sth->execute();

$sth = $dbh->prepare(
    "SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1"
);
$sth->execute();

$blob2 = null;
$sth->bindColumn(1, $blob2, PDO::PARAM_LOB);
$sth->fetch();

if ($blob1 == $blob2) {
    echo "Equal\n";
} else {
    echo "Not equal\n";
    $arr1 = str_split($blob1);
    $arr2 = str_split($blob2);
    $i=0;
    for ($i=0; $i<count($arr1); $i++) {
        if ($arr1[$i] != $arr2[$i]) {
            echo "First diff: " . dechex(ord($arr1[$i])) . " != "
                                . dechex(ord($arr2[$i])) . "\n";
            break;
        }
    }
}
?>

Perl test code:

#!/usr/bin/perl -w

use strict;
use DBI qw(:sql_types);

my $dbh = DBI->connect("dbi:mysql:host=127.0.0.1;dbname=test");
# If database encoding is NOT utf8, uncomment to break it:
# $dbh->do("SET CHARACTER SET utf8");
open FILE, "blob.bin";
binmode FILE;
read(FILE, my $blob1, 100000000);
close FILE;
my $sth = $dbh->prepare(
    "INSERT INTO pdo_blob (the_blob) VALUES(?)"
);
$sth->bind_param(1, $blob1, SQL_BLOB);
$sth->execute();
my ($blob2) = $dbh->selectrow_array(
    "SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1"
);
print ($blob1 eq $blob2 ? "Equal" : "Not equal") , "\n";
Share:
10,797
user151851
Author by

user151851

Updated on June 28, 2022

Comments

  • user151851
    user151851 almost 2 years
    1. I have a table with a BLOB field.
    2. The charset of the table is Latin1.
    3. I connect to the DB and "SET CHARACTER SET utf8".
    4. Then I save binary data into the field.
    5. Then I retrieve the data, and it's not what I saved (corrupt).

    The code:

    <?php
    $pdo = new \PDO("mysql:host=127.0.0.1;dbname=***", '***', '***');
    
    $pdo->exec('SET CHARACTER SET utf8');
    
    $sql = "INSERT INTO pdo_blob (the_blob) VALUES(:the_blob)";
    $insertStm = $pdo->prepare($sql);
    
    $blob = (binary) file_get_contents('/home/***/test.pdf');
    $insertStm->bindParam(":the_blob", $blob, \PDO::PARAM_LOB);
    $insertStm->execute();
    
    $selectStm = $pdo->prepare("SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1");
    $selectStm->execute();
    
    $savedBlob = null;
    $selectStm->bindColumn(1, $savedBlob, \PDO::PARAM_LOB);
    $selectStm->fetch();
    
    echo 'equal: ' . ((int) ($blob == $savedBlob));
    
  • user151851
    user151851 about 11 years
    Thanks for the answer. It's a way to go but I need to use PDO. I'd like to know why it happens too.