OpenCSV not escaping the quotes(")

Solution 1

The CSV format (RFC 4180) specifies that a quote (") appearing inside a quoted field must be preceded by another quote ("). Doubling the embedded quotes in the file solved my problem:

123|Bhajji|Maga|39|"I said Hey|"" I am ""5|'10."|"I a do ""you""|get that"

Reference: https://www.ietf.org/rfc/rfc4180.txt
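
To make the doubling rule concrete, here is a minimal, self-contained sketch of it. This is a hand-written illustration, not OpenCSV's implementation; the class name CsvQuoting and the methods quoteField and split are hypothetical. quoteField produces a field in the corrected form shown above, and split reads such a line back using '|' as the separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating RFC 4180 style quote doubling.
final class CsvQuoting {

    // Wraps a raw value in quotes and doubles every embedded quote.
    static String quoteField(String raw, char quote) {
        StringBuilder out = new StringBuilder().append(quote);
        for (char c : raw.toCharArray()) {
            if (c == quote) {
                out.append(quote); // the extra quote that escapes the next one
            }
            out.append(c);
        }
        return out.append(quote).toString();
    }

    // Splits one line; separators inside quotes are kept, doubled quotes collapse to one.
    static List<String> split(String line, char separator, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != quote) {
                    current.append(c);
                } else if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    current.append(quote); // doubled quote: literal quote character
                    i++;
                } else {
                    inQuotes = false;      // single quote: end of the quoted section
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "123|Bhajji|Maga|39|\"I said Hey|\"\" I am \"\"5|'10.\"|"
                + quoteField("I a do \"you\"|get that", '"');
        System.out.println(split(line, '|', '"'));
    }
}
```

With the corrected line, the sketch yields six fields, and the last two keep their embedded quotes and pipes: I said Hey|" I am "5|'10. and I a do "you"|get that.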

Solution 2

From the source code of com.opencsv:opencsv:

    /**
     * Constructs CSVReader.
     *
     * @param reader    the reader to an underlying CSV source.
     * @param separator the delimiter to use for separating entries
     * @param quotechar the character to use for quoted elements
     * @param escape    the character to use for escaping a separator or quote
     */
    public CSVReader(Reader reader, char separator,
                     char quotechar, char escape) {
        this(reader, separator, quotechar, escape, DEFAULT_SKIP_LINES, CSVParser.DEFAULT_STRICT_QUOTES);
    }

See http://sourceforge.net/p/opencsv/source/ci/master/tree/src/main/java/com/opencsv/CSVReader.java

There is a constructor with an additional escape parameter, which allows escaping separators and quotes (as per the Javadoc).
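
For this to help, the embedded quotes in the file must actually be preceded by the escape character. The following is a minimal sketch of how a field could be prepared that way, assuming the reader is constructed with '"' as quotechar and a backslash as escape. The class BackslashEscaper and its method escapeField are hypothetical and not part of OpenCSV.

```java
// Hypothetical helper: prepares a field for a reader that uses a separate escape character.
final class BackslashEscaper {

    // Wraps the value in quotes and prefixes every embedded quote or escape
    // character with the escape character.
    static String escapeField(String raw, char quote, char escape) {
        StringBuilder out = new StringBuilder().append(quote);
        for (char c : raw.toCharArray()) {
            if (c == quote || c == escape) {
                out.append(escape);
            }
            out.append(c);
        }
        return out.append(quote).toString();
    }

    public static void main(String[] args) {
        // Assumption: the file is later read with quotechar '"' and escape '\\'.
        System.out.println(escapeField("I a do \"you\"|get that", '"', '\\'));
    }
}
```

The escaped field reads "I a do \"you\"|get that" in the file. Unlike quote doubling, this form depends on the reader being configured with the same escape character.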

Author by

BhajjiMaga

Updated on June 04, 2022

Comments

  • BhajjiMaga
    BhajjiMaga almost 2 years

    I have a CSV file that can contain delimiters or unclosed quotes inside a quoted field. How do I make CSVReader ignore the quotes and delimiters inside quotes? For example:

    123|Bhajji|Maga|39|"I said Hey|" I am "5|'10."|"I a do "you"|get that"
    

    This is the content of the file.

    The program below reads the CSV file:

    @Test
    public void readFromCsv() throws IOException {
        // try-with-resources closes the reader, and with it the wrapped streams,
        // even if parsing throws
        try (CSVReader reader = new CSVReader(
                new InputStreamReader(
                        new FileInputStream("/home/netspurt/awesomefile.csv"), "UTF-8"),
                '|', '"')) {
            for (String[] row; (row = reader.readNext()) != null;) {
                System.out.println(Arrays.toString(row));
            }
        }
    }
    

    I get output like this:

    [123, Bhajji, Maga, 39, I said Hey| I am "5|'10., I am an idiot do "you|get that]
    

    What happened to the quote after you?

    Edit: The OpenCSV dependency is com.opencsv:opencsv:3.4.

    • Remigius Stalder
      Remigius Stalder almost 9 years
      Which OpenCSV are you using: com.opencsv:opencsv, au.com.bytecode:opencsv, or net.sf.opencsv:opencsv?
    • BhajjiMaga
      BhajjiMaga almost 9 years
      @RemigiusStalder: Please see now
  • BhajjiMaga
    BhajjiMaga almost 9 years
    Well, if I set both quotechar and escape to '\"', it throws an exception saying the two cannot be the same.
  • Remigius Stalder
    Remigius Stalder almost 9 years
    Try making the escape character a backslash, as in CSVReader reader = new CSVReader(isr, '|', '\"', '\\');
  • Remigius Stalder
    Remigius Stalder almost 9 years
    If I understand correctly, you have shown the result you currently get. But what result do you want to achieve? The same splits but with the quote between you and |? Or a different splitting? To be honest, this looks like a bug in CSVParser: it should either treat quotes as field delimiters or leave them as they are, and neither is the case for the swallowed quote between you and |.
  • Remigius Stalder
    Remigius Stalder almost 9 years
    I have isolated the missing quote to two simple cases: 1: ["I "y"|h"] and 2: ["I"y"|h"] (the square brackets should be removed). The second one is the same without the space after I, and strangely it swallows both quotes around the y. Each of these corner cases should imho, with the current interpretation of the default parameters, parse to one single field identical to the input line.