Listings in LaTeX with UTF-8 (or at least German umlauts)

encoding latex utf-8 diacritics listings


Solution 1

OK, I found a workaround of sorts:

Instead of the listings package, use listingsutf8:

\usepackage{listingsutf8}
Copy listings.sty to the folder where the document resides.

Find the following lines:

\lst@CCPutMacro
    \lst@ProcessOther {"23}\#
    \lst@ProcessLetter{"24}\textdollar
    \lst@ProcessOther {"25}\%
    \lst@ProcessOther {"26}\&

Insert the following lines there (each one "registers" one umlaut):

\lst@ProcessLetter{"E4}{\"a}
\lst@ProcessLetter{"F6}{\"o}
\lst@ProcessLetter{"FC}{\"u}
\lst@ProcessLetter{"C4}{\"A}
\lst@ProcessLetter{"D6}{\"O}
\lst@ProcessLetter{"DC}{\"U}
\lst@ProcessLetter{"DF}{\ss{}}

Save the file.

Use

\lstset{
    extendedchars=\true,
    inputencoding=utf8/latin1
}

to enable the mapping of UTF-8 characters to Latin-1 characters.

Convert the line endings of your source file from Windows (\r\n) to Unix (\n).
Enjoy.

I know this is ugly in many ways, but it's the only solution that works for me so far.
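
For comparison, here is a minimal sketch of how listingsutf8 is meant to be used without patching listings.sty. It assumes pdfLaTeX, that the source file is read with \lstinputlisting (as far as I know the package does not handle the inline lstlisting environment), and that every character in the file exists in Latin-1. The file name example.cpp and its contents are made up for illustration, and I have not verified this against every TeX distribution.

```latex
% Write a small UTF-8 test file next to the document (hypothetical name).
\begin{filecontents*}{example.cpp}
// die Größe muss berücksichtigt werden
int groesse = 0;
\end{filecontents*}

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{listingsutf8}

\lstset{
    language=C++,
    extendedchars=true,          % note: true, not \true
    inputencoding=utf8/latin1    % read the file as UTF-8, process it as Latin-1
}

\begin{document}
\lstinputlisting{example.cpp}
\end{document}
```

If the umlauts still come out in the wrong place with this setup, the literate approach from the other solutions is the more robust fallback.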

Solution 2

I found a simpler approach, which works for me:

\usepackage{listings}

\lstset{
  literate={ö}{{\"o}}1
           {ä}{{\"a}}1
           {ü}{{\"u}}1
}

Solution 3

For comments only, you can use the texcl option:

\lstset{language=C++,texcl=true}

Then your comments are typeset as LaTeX, so you can use "special" characters:

\begin{lstlisting}
int iLink = 0x01; // Paramètre entrée
\end{lstlisting}
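
A complete document built around this option might look like the sketch below. It assumes pdfLaTeX with inputenc and a UTF-8 encoded source; the variable names are only for illustration. Because the comment text is handed to LaTeX, characters such as $, % or _ inside a comment are interpreted as LaTeX as well and need escaping if they are meant literally.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{listings}

% texcl=true: the text of line comments is typeset as ordinary LaTeX
\lstset{language=C++, texcl=true}

\begin{document}
\begin{lstlisting}
int iLink = 0x01;    // Paramètre d'entrée
double pi = 3.141;   // This is $\pi$
double price = 9.99; // costs 10 \$ (a literal dollar sign must be escaped)
\end{lstlisting}
\end{document}
```

This only affects comments; accented characters in strings or identifiers still need one of the other solutions.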

Solution 4

This should work for other languages (Spanish, Danish) as well:

\documentclass[
a4paper, %% defines the paper size: a4paper (default), a5paper, letterpaper, ...
12pt %% set default font size to 12 point
]{scrartcl} %% article, see KOMA documentation (scrguide.dvi)

\usepackage[utf8]{inputenc}

\usepackage[T1]{fontenc}
\usepackage{listings}

\lstset{language=Pascal}
\lstset{literate=%
{Ö}{{\"O}}1
{Ä}{{\"A}}1
{Ü}{{\"U}}1
{ß}{{\ss}}2
{ü}{{\"u}}1
{ä}{{\"a}}1
{ö}{{\"o}}1
}

\begin{document}

[LaTeX: can umlauts be used in lstlisting?]
\begin{lstlisting}
Test für Umlaut äöü ÄÖÜ ß So geht es
\end{lstlisting}

\end{document}

Solution 5

My contribution for the Czech language:

\lstset{
    inputencoding=utf8,
    extendedchars=true,
    literate=%
    {á}{{\'a}}1
    {č}{{\v{c}}}1
    {ď}{{\v{d}}}1
    {é}{{\'e}}1
    {ě}{{\v{e}}}1
    {í}{{\'i}}1
    {ň}{{\v{n}}}1
    {ó}{{\'o}}1
    {ř}{{\v{r}}}1
    {š}{{\v{s}}}1
    {ť}{{\v{t}}}1
    {ú}{{\'u}}1
    {ů}{{\r{u}}}1
    {ý}{{\'y}}1
    {ž}{{\v{z}}}1
    {Á}{{\'A}}1
    {Č}{{\v{C}}}1
    {Ď}{{\v{D}}}1
    {É}{{\'E}}1
    {Ě}{{\v{E}}}1
    {Í}{{\'I}}1
    {Ň}{{\v{N}}}1
    {Ó}{{\'O}}1
    {Ř}{{\v{R}}}1
    {Š}{{\v{S}}}1
    {Ť}{{\v{T}}}1
    {Ú}{{\'U}}1
    {Ů}{{\r{U}}}1
    {Ý}{{\'Y}}1
    {Ž}{{\v{Z}}}1
}
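
The same pattern should carry over to other alphabets. As a sketch, assuming the standard LaTeX commands \ae, \o and \aa (and their uppercase forms) are available in your font encoding, the Norwegian and Danish letters could be registered like this:

```latex
\lstset{
    literate=%
    {æ}{{\ae}}1
    {ø}{{\o}}1
    {å}{{\aa}}1
    {Æ}{{\AE}}1
    {Ø}{{\O}}1
    {Å}{{\AA}}1
}
```

I believe a later literate= setting replaces an earlier one rather than adding to it, so if you need several languages it is safer to merge all entries into a single list.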


Author by

scrub

Updated on December 10, 2021

Comments

scrub over 2 years
Trying to include a source file in my LaTeX document using the listings package, I ran into problems with German umlauts inside the comments in the code. Using
```
\lstset{
extendedchars=\true,
inputencoding=utf8x
}
```
Umlauts in the source files (encoded in UTF-8 without BOM) are processed, but they are somehow moved to the beginning of the word they are contained in. So
```
// die Größe muss berücksichtigt werden
```
in the input source file, becomes
```
// die ößGre muss übercksichtigt werden
```
in the output file.

NOTE: since I found errors in my initial setup, I heavily edited this question.
scrub almost 15 years

My main document is in UTF-8 (and it works; I can even write äöü in the main document).
James almost 15 years

listings does its character processing differently from the main document, so inputenc doesn't help here; the listings package needs to support UTF-8 input explicitly (hence listingsutf8).
Vanuan over 14 years

I think, 'extendedchars=\true' is equal to 'extendedchars=false'.
GDR almost 14 years

Thank you - it worked! The same for Polish language: \lstset{literate={ą}{{\k{a}}}1 {ł}{{\l{}}}1 {ń}{{\'n}}1 {ę}{{\k{e}}}1 {ś}{{\'s}}1 {ż}{{\.z}}1 {ó}{{\'o}}1 {ź}{{\'z}}1 {Ą}{{\k{A}}}1 {Ł}{{\L{}}}1 {Ń}{{\'N}}1 {Ę}{{\k{E}}}1 {Ś}{{\'S}}1 {Ż}{{\.Z}}1 {Ó}{{\'O}}1 {Ź}{{\'Z}}1 }
Martin Thoma over 13 years

I copied listings.sty to listingsutf8.sty in /usr/share/texmf-texlive/tex/latex/listings/ on Ubuntu 10.10. I edited the file, but my listings don't work.
Chielus about 13 years

This works fine for me; the listingsutf8 package is not needed. Best workaround!
petrichor about 13 years

It also works for Turkish. Here is the related code snippet: \lstset{ literate={â}{{\^{a}}}1 {Â}{{\^{A}}}1 {ç}{{\c{c}}}1 {Ç}{{\c{C}}}1 {ğ}{{\u{g}}}1 {Ğ}{{\u{G}}}1 {ı}{{\i}}1 {İ}{{\.{I}}}1 {ö}{{\"o}}1 {Ö}{{\"O}}1 {ş}{{\c{s}}}1 {Ş}{{\c{S}}}1 {ü}{{\"u}}1 {Ü}{{\"U}}1 }
przemoc about 13 years

And thank you, GDR! It was a time saver. You only forgot ć and Ć. Here is the full list (bonus: sorted) for quick Ctrl+C + Ctrl+V for others: \lstset{literate=% {ą}{{\k{a}}}1 {ć}{{\'c}}1 {ę}{{\k{e}}}1 {ł}{{\l{}}}1 {ń}{{\'n}}1 {ó}{{\'o}}1 {ś}{{\'s}}1 {ż}{{\.z}}1 {ź}{{\'z}}1 {Ą}{{\k{A}}}1 {Ć}{{\'C}}1 {Ę}{{\k{E}}}1 {Ł}{{\L{}}}1 {Ń}{{\'N}}1 {Ó}{{\'O}}1 {Ś}{{\'S}}1 {Ż}{{\.Z}}1 {Ź}{{\'Z}}1 } (obviously comments don't have newlines, so after pasting you have to fix it (e.g. in vim: :.s/ /\r/g)
Simon over 12 years

Thank you - good solution! Anyway, it should be {ß}{{\ss}}1, because "ß" takes only 1 character in the output ;)
Jan Špaček over 11 years

This is one of the most elegant solutions here, needs more upvotes! :)
Jan Špaček over 11 years

The solution using texcl=true described in another answer seems to be more elegant.
Eduardo Santana almost 11 years

I have the same problem: I want to have a keyword with accents. Has anyone managed to do it?
Njaal Gjerde almost 11 years

Does anyone know what this solution would look like for the Norwegian characters æøå?
Elmar Zander over 9 years

Wow! Then you can even put math formulas into the listings, e.g. double pi = 3.141; // This is $\pi$ or double d = 1.0 // $3 \int_0^1 x^2 dx$. This is really cool!
zbr about 8 years

Thank you! :) Please note that I had to remove the inputencoding=utf8, and extendedchars=true, lines and also the % after literate= for it to work in my case.
WerWet over 6 years

Awesome! The other option (with literate) doesn't work with XeLaTeX.
anion over 4 years

I do not think this is a solution or workaround, because it does not help if you have long (maybe generated) content with arbitrary special characters.
user202729 over 3 years

It seems that if there's a $ or similar in a comment, it will be interpreted as a math formula (which may be unintended, and causes many bugs if the formula is malformed).