Listings in LaTeX with UTF-8 (or at least German umlauts)
Solution 1
OK, I found a kind of workaround:
Instead of the listings package, use listingsutf8:
\usepackage{listingsutf8}
Copy listings.sty to the folder where the document resides.
Find the following lines:
\lst@CCPutMacro
    \lst@ProcessOther {"23}\#
    \lst@ProcessLetter{"24}\textdollar
    \lst@ProcessOther {"25}\%
    \lst@ProcessOther {"26}\&
Insert the following lines there (each one "registers" one umlaut):
    \lst@ProcessLetter{"E4}{\"a}
    \lst@ProcessLetter{"F6}{\"o}
    \lst@ProcessLetter{"FC}{\"u}
    \lst@ProcessLetter{"C4}{\"A}
    \lst@ProcessLetter{"D6}{\"O}
    \lst@ProcessLetter{"DC}{\"U}
    \lst@ProcessLetter{"DF}{\ss{}}
Save the file
Use
\lstset{ extendedchars=\true, inputencoding=utf8/latin1 }
to enable the mapping of UTF-8 characters to Latin-1 characters.
- Convert the line endings of your source file from Windows (\r\n) to Unix (\n)
- Enjoy
I know this is ugly in many ways, but it's the only solution that has worked for me so far.
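For reference, here is a minimal sketch of how the listingsutf8 route is usually set up without patching listings.sty. It assumes pdfLaTeX, a UTF-8 encoded source file with the hypothetical name example.cpp, and that every non-ASCII character in that file exists in Latin-1. As far as I know, listingsutf8 only converts files read with \lstinputlisting, not code typed directly into a lstlisting environment.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listingsutf8}
\lstset{
  language=C++,
  extendedchars=true,
  inputencoding=utf8/latin1 % read the file as UTF-8, hand it to listings as Latin-1
}
\begin{document}
% example.cpp is a placeholder name for your own UTF-8 encoded source file
\lstinputlisting{example.cpp}
\end{document}
```

Characters outside Latin-1 (for example most Czech or Polish letters) are not covered by this mapping.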
Solution 2
I found a simpler approach, which works for me:
\usepackage{listings}
\lstset{
literate={ö}{{\"o}}1
{ä}{{\"a}}1
{ü}{{\"u}}1
}
Solution 3
For comments only, you can use the texcl option:
\lstset{language=C++,texcl=true}
Then your comments are typeset as LaTeX and you can use "special" characters:
\begin{lstlisting}
int iLink = 0x01; // Paramètre entrée
\end{lstlisting}
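A complete document along these lines might look as follows. This is only a sketch, assuming pdfLaTeX and a UTF-8 encoded .tex file. Note that texcl applies to comment lines only (here //), so the code itself is still processed by listings as usual.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
\lstset{language=C++,texcl=true}
\begin{document}
\begin{lstlisting}
int iLink = 0x01; // Paramètre entrée
\end{lstlisting}
\end{document}
```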
Solution 4
This should work for other languages (Spanish, Danish) as well:
\documentclass[
a4paper, %% defines the paper size: a4paper (default), a5paper, letterpaper, ...
12pt %% set default font size to 12 point
]{scrartcl} %% article, see KOMA documentation (scrguide.dvi)
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
\lstset{language=Pascal}
\lstset{literate=%
{Ö}{{\"O}}1
{Ä}{{\"A}}1
{Ü}{{\"U}}1
{ß}{{\ss}}2
{ü}{{\"u}}1
{ä}{{\"a}}1
{ö}{{\"o}}1
}
\begin{document}
[Latex: kann man Umlaute in lstlisting verwenden?]
\begin{lstlisting}
Test für Umlaut äöü ÄÖÜ ß So geht es
\end{lstlisting}
\end{document}
Solution 5
My contribution for Czech language.
\lstset{
inputencoding=utf8,
extendedchars=true,
literate=%
{á}{{\'a}}1
{č}{{\v{c}}}1
{ď}{{\v{d}}}1
{é}{{\'e}}1
{ě}{{\v{e}}}1
{í}{{\'i}}1
{ň}{{\v{n}}}1
{ó}{{\'o}}1
{ř}{{\v{r}}}1
{š}{{\v{s}}}1
{ť}{{\v{t}}}1
{ú}{{\'u}}1
{ů}{{\r{u}}}1
{ý}{{\'y}}1
{ž}{{\v{z}}}1
{Á}{{\'A}}1
{Č}{{\v{C}}}1
{Ď}{{\v{D}}}1
{É}{{\'E}}1
{Ě}{{\v{E}}}1
{Í}{{\'I}}1
{Ň}{{\v{N}}}1
{Ó}{{\'O}}1
{Ř}{{\v{R}}}1
{Š}{{\v{S}}}1
{Ť}{{\v{T}}}1
{Ú}{{\'U}}1
{Ů}{{\r{U}}}1
{Ý}{{\'Y}}1
{Ž}{{\v{Z}}}1
}
scrub
Updated on December 10, 2021
Comments
-
scrub over 2 years
While trying to include a source file in my LaTeX document using the listings package, I ran into problems with German umlauts inside the comments in the code. Using
\lstset{ extendedchars=\true, inputencoding=utf8x }
Umlauts in the source files (encoded in UTF-8 without BOM) are processed, but they are somehow moved to the beginning of the word they are contained in. So
// die Größe muss berücksichtigt werden
in the input source file, becomes
// die ößGre muss übercksichtigt werden
in the output file.
NOTE: since I found errors in my initial setup, I heavily edited this question.
-
scrub almost 15 years
My main document is in UTF-8 (and it works; I can even write äöü in the main document).
-
James almost 15 years
listings does its character processing differently from the main document, so inputenc doesn't help here; the listings package needs to support UTF-8 input explicitly (hence listingsutf8).
-
Vanuan over 14 years
I think 'extendedchars=\true' is equivalent to 'extendedchars=false'.
-
GDR almost 14 years
Thank you - it worked! The same for the Polish language: \lstset{literate={ą}{{\k{a}}}1 {ł}{{\l{}}}1 {ń}{{\'n}}1 {ę}{{\k{e}}}1 {ś}{{\'s}}1 {ż}{{\.z}}1 {ó}{{\'o}}1 {ź}{{\'z}}1 {Ą}{{\k{A}}}1 {Ł}{{\L{}}}1 {Ń}{{\'N}}1 {Ę}{{\k{E}}}1 {Ś}{{\'S}}1 {Ż}{{\.Z}}1 {Ó}{{\'O}}1 {Ź}{{\'Z}}1 }
-
Martin Thoma over 13 years
I copied listings.sty to listingsutf8.sty in /usr/share/texmf-texlive/tex/latex/listings/ on Ubuntu 10.10. I edited the file, but my listings don't work.
-
Chielus about 13 years
This works fine for me; the listingsutf8 package is not needed. Best workaround!
-
petrichor about 13 years
It also works for Turkish. Here is the related code snippet:
\lstset{ literate={â}{{\^{a}}}1 {Â}{{\^{A}}}1 {ç}{{\c{c}}}1 {Ç}{{\c{C}}}1 {ğ}{{\u{g}}}1 {Ğ}{{\u{G}}}1 {ı}{{\i}}1 {İ}{{\.{I}}}1 {ö}{{\"o}}1 {Ö}{{\"O}}1 {ş}{{\c{s}}}1 {Ş}{{\c{S}}}1 {ü}{{\"u}}1 {Ü}{{\"U}}1 }
-
przemoc about 13 years
And thank you, GDR! It was a time saver. You only forgot ć and Ć. Here is the full list (bonus: sorted) for quick Ctrl+C + Ctrl+V for others: \lstset{literate=% {ą}{{\k{a}}}1 {ć}{{\'c}}1 {ę}{{\k{e}}}1 {ł}{{\l{}}}1 {ń}{{\'n}}1 {ó}{{\'o}}1 {ś}{{\'s}}1 {ż}{{\.z}}1 {ź}{{\'z}}1 {Ą}{{\k{A}}}1 {Ć}{{\'C}}1 {Ę}{{\k{E}}}1 {Ł}{{\L{}}}1 {Ń}{{\'N}}1 {Ó}{{\'O}}1 {Ś}{{\'S}}1 {Ż}{{\.Z}}1 {Ź}{{\'Z}}1 } (obviously comments don't have newlines, so after pasting you have to fix it (e.g. in vim: :.s/ /\r/g))
-
Simon over 12 years
Thank you - good solution! Anyway, it should be {ß}{{\ss}}1, because "ß" takes only 1 character in the output ;)
-
Jan Špaček over 11 years
This is one of the most elegant solutions here; it needs more upvotes! :)
-
Jan Špaček over 11 years
The solution using texcl=true described in another answer seems to be more elegant.
-
Eduardo Santana almost 11 years
I have the same problem: I want to have a keyword with accents. Has anyone managed to do it?
-
Njaal Gjerde almost 11 years
Does anyone know what this solution would look like for the Norwegian characters æøå?
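Following the same literate pattern, a mapping for the Norwegian letters could look like the sketch below. It is untested here and assumes pdfLaTeX with a UTF-8 encoded source; \ae, \o, \aa and their uppercase forms are the standard LaTeX commands for these letters.

```latex
\lstset{literate=%
  {æ}{{\ae}}1
  {ø}{{\o}}1
  {å}{{\aa}}1
  {Æ}{{\AE}}1
  {Ø}{{\O}}1
  {Å}{{\AA}}1
}
```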
-
Elmar Zander over 9 years
Wow! Then you can even put math formulas into the listings, e.g.
double pi = 3.141; // This is $\pi$
or
double d = 1.0 // $3 \int_0^1 x^2 dx$
This is really cool!
-
zbr about 8 years
Thank you! :) Please note that I had to remove the inputencoding=utf8, and extendedchars=true, lines and also the % after literate= for it to work in my case.
-
WerWet over 6 years
Awesome! The other option (with literate) doesn't work with XeLaTeX.
-
anion over 4 years
I do not think this is a solution or workaround, because it does not help if you have long (maybe generated) content with arbitrary special characters.
-
user202729 over 3 years
It seems that if there's a $ or similar in some comment, it will be interpreted as a math formula (which may be unintended, and causes many bugs if it is malformed).
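With texcl=true, one way around this is to escape such characters in the comment as you would in ordinary LaTeX text. A minimal sketch, assuming pdfLaTeX and C++ line comments; the variable names are made up for illustration:

```latex
\documentclass{article}
\usepackage{listings}
\lstset{language=C++,texcl=true}
\begin{document}
\begin{lstlisting}
double price = 9.99;  // price in \$ (escaped, so no math mode)
int max_len = 80;     // limit for max\_len (underscore escaped)
\end{lstlisting}
\end{document}
```

This only helps when you control the source file; for generated code with arbitrary comments, leaving texcl off is probably the safer choice.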