How to insert (raw bytes from file data) using a plain text script

10,074

The easiest way, IMO, to represent bytea data in an SQL file is to use the hex format:

8.4.1. bytea Hex Format

The "hex" format encodes binary data as 2 hexadecimal digits per byte, most significant nibble first. The entire string is preceded by the sequence \x (to distinguish it from the escape format). In some contexts, the initial backslash may need to be escaped by doubling it, in the same cases in which backslashes have to be doubled in escape format; details appear below. The hexadecimal digits can be either upper or lower case, and whitespace is permitted between digit pairs (but not within a digit pair nor in the starting \x sequence). The hex format is compatible with a wide range of external applications and protocols, and it tends to be faster to convert than the escape format, so its use is preferred.

Example:

SELECT E'\\xDEADBEEF';

Converting an array of bytes to hex should be trivial in any language that a sane person (such a yourself) would use to write the SQL file generator.

Share:
10,074

Related videos on Youtube

axiopisty
Author by

axiopisty

SOreadytohelp

Updated on June 04, 2022

Comments

  • axiopisty
    axiopisty almost 2 years

    Database: Postgres 9.1

    I have a table called logos defined like this:

    create type image_type as enum ('png');
    create table logos (
      id UUID primary key,
      bytes bytea not null,
      type image_type not null,
      created timestamp with time zone default current_timestamp not null
    );
    create index logo_id_idx on logos(id);
    

    I want to be able to insert records into this table in 2 ways.

    The first (and most common) way rows will be inserted in the table will be that a user will provide a PNG image file via an html file upload form. The code processing the request on the server will receive a byte array containing the data in the PNG image file and insert a record in the table using something very similar to what is explained here. There are plenty of example of how to insert byte arrays into a postgresql field of type bytea on the internet. This is an easy exercise. An example of the insert code would look like this:

    insert into logos (id, bytes, type, created) values (?, ?, ?, now()) 
    

    And the bytes would be set with something like:

    ...
    byte[] bytes = ... // read PNG file into a byte array.
    ...
    ps.setBytes(2, bytes);
    ...
    

    The second way rows will be inserted in the table will be from a plain text file script. The reason this is needed is only to populate test data into the table for automated tests, or to initialize the database with a few records for a remote development environment.

    Regardless of how the data is entered in the table, the application will obviously need to be able to select the bytea data from the table and convert it back into a PNG image.


    Question

    How does one properly encode a byte array, to be able to insert the data from within a script, in such a way that only the original bytes contained in the file are stored in the database?

    I can write code to read the file and spit out insert statements to populate the script. But I don't know how to encode the byte array for the plain text script such that when running the script from psql the image data will be the same as if the file was inserted using the setBytes jdbc code.

    I would like to run the script with something like this:

    psql -U username -d dataBase -a -f test_data.sql
    
  • axiopisty
    axiopisty about 10 years
    It looks like this might work for me but for some reason when I retrieve the logo from the database the byte array is twice as long as it should be. Google referred me to the bytea_output parameter but I can't figure out how to set the parameter to hex. Any help is appreciated. I'm new to postgresql.
  • mu is too short
    mu is too short about 10 years
    Are you doing this in Java/JDBC? I don't know how to configure Java for this. This thread suggests that you might need a newer version of the JDBC driver.
  • axiopisty
    axiopisty about 10 years
    Ultimately yes, but the code is really written with scala and anorm. The database is installed with a puppet script. I thought the database was postgres 9.1, as specified in the puppet script but select version() spit out "PostgreSQL 8.4.18 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit" and the jdbc driver (gradle coordinates) is postgresql:postgresql:9.1-901.jdbc4
  • axiopisty
    axiopisty about 10 years
    And connecting from psql gives "WARNING: psql version 9.1, server version 8.4."
  • mu is too short
    mu is too short about 10 years
    Have you tried using the "escape" format? The manual describes it right below the hex format section I linked to. AFAIK the default format for bytea changed from escape to hex around PostgreSQL 9.0 or so.
  • axiopisty
    axiopisty about 10 years
    I read the entire document (postgresql.org/docs/current/interactive/datatype-binary.htm‌​l) Maybe it is trivial, but I still don't understand how to generate the "escape"d string to insert into the database with a script. Would it be possible for someone to provide a java method with signature String byteArrayToPSQLEscapedString(byte[] bytes) {...} and an example insert statement using the results of that method?
  • mu is too short
    mu is too short about 10 years
    Try converting that last comment into another question and make sure you include the "java" tag. Sorry but we've reached the limit of my Java/JDBC knowledge.