Converting webpages from UTF-8 to ISO-8859-1 in Linux

16,481

Solution 1

Ubuntu has recode, which converts the files in place:

$ sudo apt-get install recode
$ recode UTF-8..latin1 *.php

Recursively, thanks to Ted Dziuba:

$ find . -name "*.php" -exec recode UTF-8..latin1 {} \;

Solution 2

I think iconv is your answer...

From man iconv:

  NAME
      iconv - Convert encoding of given files from one encoding to another

  SYNOPSIS
      iconv -f encoding -t encoding inputfile

  DESCRIPTION
      The iconv program converts the encoding of characters in inputfile from one coded 
      character set to another. The result is written to standard output unless otherwise 
      specified by the --output option.

      .....
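As a quick illustration of the stdout behaviour described in that excerpt, here is a minimal sketch. The sample string is made up for the example, and it assumes an iconv and od like the ones shipped on a typical Linux system. It pipes a UTF-8 string through iconv and dumps the resulting bytes:

```shell
#!/bin/sh
# "café" followed by a newline; \303\251 is the two-byte UTF-8 form of é.
# In ISO-8859-1 the same character is the single byte e9.
printf 'caf\303\251\n' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1
# bytes shown: 63 61 66 e9 0a
```

Because the result goes to standard output, nothing on disk changes unless you redirect it or use the --output option.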

So you could probably do something like:

find "$my_base_dir" \( -name "*.php" -o -name "*.html" \) -exec sh -c '
   iconv -f UTF-8 -t ISO-8859-1 -o "$1.iconv" "$1" &&
   mv "$1.iconv" "$1"
' sh {} \;

This will recursively find the appropriately named files and re-encode them. The temporary file is necessary because iconv truncates its output file before it starts reading, so writing straight back to the input file would destroy the data.
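If you would rather keep the original whenever a conversion fails, the same temporary-file idea can be wrapped in a small function. This is only a sketch: the name to_latin1 is hypothetical, and it assumes iconv exits non-zero when it cannot convert the input (as GNU iconv does for invalid UTF-8).

```shell
#!/bin/sh
# Hypothetical helper (not part of the answer above): convert one file from
# UTF-8 to ISO-8859-1, replacing the original only if iconv succeeded.
to_latin1() {
    src=$1
    tmp="$src.iconv"    # temporary file next to the original
    if iconv -f UTF-8 -t ISO-8859-1 "$src" > "$tmp"; then
        mv "$tmp" "$src"
    else
        # Conversion failed (for example, invalid UTF-8 input):
        # remove the partial output and leave the original untouched.
        rm -f "$tmp"
        echo "skipped: $src" >&2
        return 1
    fi
}
```

You could then call it in a loop, for example: for f in ./*.php; do to_latin1 "$f"; done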

Author by

Lemon

Software Developer, Geek, HSP, SDA, ..., open, honest, careful, perfectionist, ... Currently into indoor rowing and rock climbing, just to mention something non-computer-related... Not the best at bragging about myself... so... not sure what more to write... 🤔

Updated on June 15, 2022

Comments

  • Lemon
    Lemon almost 2 years

    Does anyone have a neat trick for converting a number of PHP and HTML files from UTF-8 to ISO-8859-1 in Linux (Ubuntu)?

  • David Z
    David Z about 15 years
    recode is a fairly standard Linux program - not so standard that it's always installed by default, but it should be available on all distributions, not just Ubuntu.
  • Ted Dziuba
    Ted Dziuba about 15 years
    Recursively, it's find . -name "*.php" -exec recode UTF-8..latin1 {} \;
  • Luiz Damim
    Luiz Damim about 14 years
    +1 Found your answer while searching Google for this conversion. It saved my day :)
  • Roger
    Roger over 9 years
    To do it with all files in a directory recursively, from ISO to UTF: find "$F" -name "*" -exec recode latin1..UTF-8 {} \; where $F is the path to the files.
  • Scott
    Scott over 6 years
    To confirm what @DavidZ said: it's also available in Cygwin.