Set encoding when converting a text file to PDF using iText


Solution 1

When I look at your code, I see a number of things that are odd.

  1. You say you require UTF-8, but you create a BaseFont object using BaseFont.CP1252 instead of BaseFont.IDENTITY_H (which is the "encoding" you need when you work with Unicode).
  2. You use the standard Type 1 font Courier, which is a font that doesn't know how to render é,è,à... and a font that is never embedded. As documented, the BaseFont.EMBEDDED parameter is ignored in this case!
  3. You don't use this font with an object that has actual content. The actual content is put into a Paragraph that is created using the default font "Helvetica", a font that doesn't know how to render é,è,à...

To solve this, you need to create the Paragraph with the appropriate font. That is NOT a standard Type 1 font, but a font file such as courier.ttf. You also need to use the appropriate encoding: BaseFont.IDENTITY_H.
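A minimal sketch of that advice, assuming the iText 2.x package names (com.lowagie.text) that the question's code comments refer to; newer iText 5 releases use com.itextpdf.text instead. The class name EmbeddedFontExample and the idea of passing the font file path as a second argument are illustrative assumptions, not part of the original code.

```java
import java.io.FileOutputStream;

import com.lowagie.text.Document;
import com.lowagie.text.Font;
import com.lowagie.text.PageSize;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfWriter;

class EmbeddedFontExample {
    public static void main(String[] args) throws Exception {
        // args[0]: output PDF path
        // args[1]: path to a TrueType font file, e.g. a courier.ttf on your system (assumption)
        BaseFont base = BaseFont.createFont(args[1], BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font font = new Font(base, 12, Font.NORMAL);

        Document output = new Document(PageSize.LETTER, 40, 40, 40, 40);
        PdfWriter.getInstance(output, new FileOutputStream(args[0]));
        output.open();

        // Pass the font to the Paragraph itself; a Paragraph created
        // without a font falls back to the default Helvetica.
        output.add(new Paragraph("\u00e9 \u00e8 \u00e0 \u00b0", font));

        output.close();
    }
}
```

The key difference from the question's code is that the font is handed to every Paragraph that carries text, rather than to an empty Chunk.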

Solution 2

Both the reader and the writer should be set to use UTF-8 character set encoding to read/write UTF-8 characters properly. For example,

input = new BufferedReader(new InputStreamReader(new FileInputStream(args[0]), "UTF-8"));
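A slightly fuller sketch of reading the input as UTF-8, using only the standard library. InputStreamReader wraps an InputStream, so the file name has to go through FileInputStream first. The names Utf8LineReader and readLines are illustrative and do not come from the original code.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class Utf8LineReader {
    // Reads all lines of the file at `path`, decoding its bytes as UTF-8
    // instead of relying on the platform default charset (which is what
    // FileReader did before Java 18).
    static List<String> readLines(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader input = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            while ((line = input.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Each returned line can then be wrapped in a Paragraph. Using try-with-resources closes the reader even if reading fails partway through.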
Author: Amira

Updated on August 02, 2022

Comments

  • Amira (almost 2 years ago)

    I'm trying to get iText to output my UTF-8 encoded text correctly. The input file contains symbols like ° and accented Latin characters (é, è, à...).

    I haven't found a solution. This is the code I'm using:

    BufferedReader input = null;
    Document output = null;
    System.out.println("Convert text file to pdf");
    System.out.println("input  : " + args[0]);
    System.out.println("output : " + args[1]);
    try {
      // text file to convert to pdf as args[0]
      input = 
        new BufferedReader (new FileReader(args[0]));
      // letter 8.5x11
      //    see com.lowagie.text.PageSize for a complete list of page-size constants.
      output = new Document(PageSize.LETTER, 40, 40, 40, 40);
      // pdf file as args[1]
      PdfWriter.getInstance(output, new FileOutputStream (args[1]));
    
      output.open();
      output.addAuthor("RealHowTo");
      output.addSubject(args[0]);
      output.addTitle(args[0]);
    
      BaseFont courier = BaseFont.createFont(BaseFont.COURIER, BaseFont.CP1252, BaseFont.EMBEDDED);
      Font font = new Font(courier, 12, Font.NORMAL);
      Chunk chunk = new Chunk("",font);
      output.add(chunk); 
    
      String line = "";
      while(null != (line = input.readLine())) {
        System.out.println(line);
        Paragraph p = new Paragraph(line);
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        output.add(p);
      }
      System.out.println("Done.");
      output.close();
      input.close();
      System.exit(0);
    }
    catch (Exception e) {
      e.printStackTrace();
      System.exit(1);
    }
    }
    

    Any ideas would be appreciated.