MySQL import in phpmyadmin (CSV) chokes on quotes

17,663

Solution 1

I've been having a similar problem for the last several hours and I've finally gotten an import to work so I'll share my solution, even though it may not help the original poster.

Short version:
1.) if an Excel file, save as ODS (open document spreadsheet) format. 1a.) If the file is some kind of text format with delimiters (like the original poster has), then open Excel, and once inside Excel use File/Open to open the file. There you will be able to select the appropriate delimiter to view the file. Make sure the file looks alright, THEN save as ODS format (and close the file).

2.) Open the file in OpenOffice Calc (free download from Oracle/Sun).

2a.) Press Ctrl-F to open the Find dialog box. Click More Options and make sure "Current Selection Only" is NOT checked.

2b.) Search for double quotes. If there are none in your file, you can skip steps 4 and 5.

3.) Save As -> Text CSV. Select options for UTF-8 format (press "u" 3 times to get there fast), select ";" (semi colon) as separator, and select double quotes for text.

4.) If there were any double quotes found in your file in step 2b, continue, otherwise just import the file as CSV with phpMyAdmin (see step 6). It should work.

5a.) Open in Word or any other text editor where you can do Find -> Replace All.

5b.) Find all instances of three double quotes in a row by searching for """ (if you do find any, you might even want to search for 4, 5, 6 etc. in a row until you come up empty).

5c.) Replace the """ with a placeholder that is not found anywhere else in your csv. I replaced them with 'abcdefg'.

5d.) Find -> Replace all instances of "" (two double quotes in a row) with " (forward slash and double quote).

5e.) Find -> Replace all instances of abcdefg (or your chosen placeholder from step 5c) with "". 5c and this step ensure that any quotes occuring at the end of a field just before the text-delimiting quote are properly 'escaped'.

5f.) Finally, save the file, keeping in UTF-8 (or whatever format you need for import).

6.a) In phpMyAdmin, click the "import" tab, click the "choose file" button, and select the file you just saved.

6b.) under 'Format of imported file' CSV should be selected. If column names are in the first row, make sure that checkbox is checked. Most importantly, 'Fields terminated by' should be set to ; (semi colon), 'Fields enclosed by' should be set to " (double quotes), and 'Fields escaped by' should be set to \ (forward slash). You set that up in your file by following step 3, and if necessary by following steps 5a - 5f.

7.) Click "Go" and pray you didn't just waste another hour.

Now that the short version has turned out this long, I'll skip the long version.

Suffice it to say, there seem to be 2 major problems with importing through phpmyadmin. 1.) There's some kind of memory problem that prevents large Excel and ODS files (how large is large? not sure yet) being imported. 2.) Neither OpenOffice nor Excel seem to save their csv files in a way that's compatible with phpmyadmin. They want to escape double quotes with double quotes. phpMyAdmin wants double quotes escaped with something else, like forward slash.

The first problem will hopefully be fixed in an update of phpmyadmin (and/or the Excel importing add-on 'PHPExcel'). The second one could be fixed if there was an easy way to change the escape character for Excel or ODS files saved as CSV, or if phpMyAdmin could be made compatible with their format (that should actually be pretty easy. Simply have it perform the same find-replace actions we performed manually above to skirt the double quote problem).

I hope this helps somebody, as I spent 3-4 hours discovering this solution and another hour writing it here. I hope it's not too long, but I was hoping to help people at all levels of expertise from zero to wherever I am (probably around 0.1).

Solution 2

I found a hack that works -- I use the $ as the "enclosed by" character and all is well. Since this is for a European site, I know that they'll never use it in the table content.

Share:
17,663
Andrew Swift
Author by

Andrew Swift

I started out on an IBM mainframe in Palo Alto in 1981, then a PDP-11 in high school in Amherst, Commodore CBM, etc... finished 1/2 a bachelor's degree in CS at Duke, then decided to major in philosophy. Finally ended up doing graphic design, and in the late 90's when web sites became important got back into programming with javascript, perl, php, actionscript, etc. I love the book "The Adolescence of P1", and live in southern France.

Updated on June 04, 2022

Comments

  • Andrew Swift
    Andrew Swift almost 2 years

    I am trying to import a .csv file into a MySQL table via phpMyAdmin.

    The .csv file is separated by pipes, formated like this:

    data|d'ata|d'a"ta|dat"a|
    data|"da"ta|data|da't'a|
    dat'a|data|da"ta"|da'ta|
    

    The data contains quotes. I have no control over the format in which I recieve the data -- it is generated by a third party.

    The problem comes when there is a | followed by a double quote. I always get an "invalid field count in CSV input on line N" error.

    I am uploading the file from the import page, using Latin1, CSV, terminated by |, separated by ".

    I would like to just change the "enclosed by" character, but I keep getting "Invalid parameter for CSV import: Fields enclosed by". I have tried various characters with no success.

    How can I tell MySQL to accept this format in phpMyAdmin?

    Setting up these tables is the first step in writing a program that will use uploaded gzipped .csv files to maintain the catalog of an e-commerce site.