Regex to match only letters
Solution 1
Use a character set: [a-zA-Z] matches one letter from A–Z in lowercase or uppercase. [a-zA-Z]+ matches one or more letters, and ^[a-zA-Z]+$ matches only strings that consist of one or more letters and nothing else (^ and $ mark the beginning and end of the string respectively).
If you want to match letters other than A–Z, you can either add them to the character set, as in [a-zA-ZäöüßÄÖÜ], or use a predefined character class such as the Unicode character property class \p{L}, which describes the Unicode characters that are letters.
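As a rough illustration of the anchored pattern, here is a minimal JavaScript sketch. The helper name isAsciiLetters is hypothetical, introduced only for this example; it is not part of any library.

```javascript
// Hypothetical helper (name chosen for this example only):
// returns true when the whole string is one or more ASCII letters.
function isAsciiLetters(s) {
  return /^[a-zA-Z]+$/.test(s);
}

console.log(isAsciiLetters("Hello"));        // true
console.log(isAsciiLetters("Hello1"));       // false: contains a digit
console.log(isAsciiLetters(""));             // false: + needs at least one letter
console.log(isAsciiLetters("M\u00fcller"));  // false: ü is outside a-z and A-Z
```

The last call shows why the ASCII-only set is not enough for non-English names.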
Solution 2
\p{L} matches anything that is a Unicode letter, if you're interested in alphabets beyond the Latin one.
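In JavaScript, \p{L} is only recognized when the u flag is set (one of the comments further down makes the same point). A minimal sketch, assuming an engine with Unicode property escape support such as a recent Node.js; the constant name is made up for illustration.

```javascript
// Assumes an engine that supports Unicode property escapes (ES2018 and later).
// Without the u flag, \p{L} is not treated as a Unicode property.
const unicodeLettersOnly = /^\p{L}+$/u;

console.log(unicodeLettersOnly.test("M\u00fcller"));        // true  (Müller)
console.log(unicodeLettersOnly.test("\u0416\u0443\u043a")); // true  (Cyrillic "Жук")
console.log(unicodeLettersOnly.test("abc123"));             // false (digits are not letters)
```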
Solution 3
Depending on your meaning of "character":
[A-Za-z] - all ASCII letters (uppercase and lowercase)
[^0-9] - all non-digit characters (this includes whitespace and symbols, not just letters)
Solution 4
The closest option available is [\u\l]+, which matches a sequence of uppercase and lowercase letters. However, it is not supported by all editors/languages, so it is probably safer to use [a-zA-Z]+, as other users suggest.
Solution 5
You would use
/[a-z]/gi
[] - a character class: matches any one of the characters listed inside
a-z - the range covering the whole alphabet from a to z
g - global flag: searches throughout the whole string
i - case-insensitive flag: matches both uppercase and lowercase
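Note that /[a-z]/gi finds letters anywhere in a string; it does not by itself check that the string contains nothing but letters. A small sketch of the difference, using made-up variable names:

```javascript
const input = "Ab1 c!";

// With the g flag, match() returns every single-letter match (or null if none).
const found = input.match(/[a-z]/gi);
console.log(found); // [ 'A', 'b', 'c' ]

// To require that the entire string is letters, anchor the pattern and use +.
console.log(/^[a-z]+$/i.test(input)); // false
console.log(/^[a-z]+$/i.test("Abc")); // true
```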

Nike
Updated on July 08, 2022
Comments
-
Nike 6 months
How can I write a regex that matches only letters?
-
Philip Potter over 12 years
Not in all regex flavours. For example, vim regexes treat \p as "Printable character".
-
Joachim Sauer over 12 years
That's a very ASCII-centric solution. This will break on pretty much any non-English text.
-
Philip Potter over 12 years
This page suggests that only Java, .NET, Perl, JGsoft, XML and XPath regexes support \p{L}. But there are major omissions: Python and Ruby (though Python has the regex module).
-
Gumbo over 12 years
@Joachim Sauer: It will rather break on languages using non-Latin characters.
-
Nike over 12 years
I meant letters. It doesn't appear to be working though: preg_match('/[a-zA-Z]+/', $name);
-
Ivo Wetzel over 12 years
Already breaks on 90% of German text, don't even mention French or Spanish. Italian might still do pretty well though.
-
Joachim Sauer over 12 years
That depends on what definition of "Latin character" you choose. J, U, Ö, Ä can all be argued to be Latin characters or not, based on your definition. But they are all used in languages that use the "Latin alphabet" for writing.
-
KristofMols over 12 years
[A-Za-z] is just the declaration of the characters you can use. You still need to declare how many times this declaration has to be used: [A-Za-z]{1,2} (to match 1 or 2 letters) or [A-Za-z]{1,} (to match 1 or more letters).
-
Jörg W Mittag over 12 years
@Philip Potter: Ruby supports Unicode character properties using that exact same syntax.
-
Amal Murali over 8 years
\w may not be a good solution in all cases. At least in PCRE, \w can match other characters as well. Quoting the PHP manual: "A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place. For example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w."
-
OGHaza over 8 years
That is not what [^\W|\d] means.
-
OGHaza over 8 years
[^\W|\d] means not \W and not | and not \d. It has the same net effect since | is part of \W, but the | does not work as you think it does. Even then, that means it accepts the _ character. You are probably looking for [^\W\d_].
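The difference between the two classes can be checked directly. A minimal JavaScript sketch follows; note that in JavaScript \w covers only ASCII letters, digits and the underscore, so the behaviour for accented letters may differ in other flavours. The constant names are illustrative.

```javascript
const withPipe = /^[^\W|\d]+$/;    // the | is just a literal inside the class
const lettersOnly = /^[^\W\d_]+$/; // also excludes the underscore

console.log(withPipe.test("a_b"));     // true: the underscore is accepted
console.log(lettersOnly.test("a_b"));  // false
console.log(lettersOnly.test("abc"));  // true
console.log(lettersOnly.test("abc1")); // false

// Inside a character class, | matches the pipe character itself.
console.log(/[a|b]/.test("|"));        // true
```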
-
Motlab over 8 years
I agree with you, it accepts the _. But "NOT" | is equivalent to "AND", so [^\W|\d] means: NOT \W AND NOT \d.
-
OGHaza over 8 years
[^ab] means not a and not b. [^a|b] means not a and not | and not b. To give a second example, [a|b|c|d] is exactly the same as [abcd|||], which is exactly the same as [abcd|], all of which equate to ([a]|[b]|[c]|[d]|[|]). The | is a literal character, not an OR operator. The OR operator is implied between each character in a character class; putting an actual | means you want the class to accept the | (pipe) character.
-
V-SHY over 7 years
Words include characters other than letters.
-
Nyerguds over 6 years
Won't match any special characters though.
-
Eugen Konkov over 6 years
\w means match letters and numbers.
-
ZoFreX over 6 years
I think this should be \p{L}\p{M}*+ to cover letters made up of multiple codepoints, e.g. a letter followed by accent marks. As per regular-expressions.info/unicode.html
-
phuclv over 6 years
Well, à, á, ã, Ö, Ä... are letters too, and so are অ, আ, ই, ঈ, Є, Ж, З, ﺡ, ﺥ, ﺩ, א, ב, ג, ש, ת, ... en.wikipedia.org/wiki/Letter_%28alphabet%29
-
Radu Simionescu about 6 years
\p{L} matches all the umlauts, cedillas, accents etc., so you should go with that.
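Whether \p{L} on its own accepts an accented letter depends on how the text is composed: a precomposed character such as é (U+00E9) is a letter, but the same letter written as e plus a combining accent (U+0301) contains a mark, which \p{L} does not match. Below is a hedged JavaScript sketch of the \p{L}\p{M}* idea mentioned above. JavaScript has no possessive quantifiers, so the *+ is replaced by a plain * here, and the constant names are made up.

```javascript
const baseLettersOnly = /^\p{L}+$/u;
const lettersWithMarks = /^(?:\p{L}\p{M}*)+$/u;

const precomposed = "caf\u00e9";  // "café" with a single code point for é
const decomposed = "cafe\u0301";  // "café" as e followed by a combining acute accent

console.log(baseLettersOnly.test(precomposed));  // true
console.log(baseLettersOnly.test(decomposed));   // false: U+0301 is a mark, not a letter
console.log(lettersWithMarks.test(decomposed));  // true
console.log(lettersWithMarks.test("\u0301"));    // false: a mark needs a base letter first
```

Normalizing input with String.prototype.normalize("NFC") before matching is another option, though not every letter plus mark sequence has a precomposed form.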
-
user1329482 about 5 years
Works well in a selector engine for determining if the selector is just a tag name.
-
DaveMongoose almost 5 years
This will also match whitespace, symbols, etc., which does not seem to be what the question is asking for.
-
AER almost 5 years
What do you do if you can't use [] because Python is too thick to understand nestings?
-
The Witness over 4 years
And what about, for instance, “Zażółć gęslą jaźń”?
-
karoluS over 4 years
It doesn't include letters with diacritics such as ŹŻŚĄ.
-
matanster over 3 years
With Python 3 this yields an error: bad escape \p at position 0
-
Pablo over 3 years
Instead of adding more and more characters like äöüßÄÖÜ, you can use ^[a-zA-Z]\p{L}+$ to include most of the western alphabets.
-
Catalina Chircu about 3 years
@phuclv: Indeed, but that depends on the encoding, and the encoding is part of the settings of the program (either the default config or the one declared in a config file of the program). When I worked on different languages, I used to store that in a constant, in a config file.
-
phuclv about 3 years
@CatalinaChircu encoding is absolutely irrelevant here. An encoding is a way to represent the code points of a character set in binary; for example, UTF-8 is an encoding for Unicode. Letters, OTOH, depend on the language, and if one says [A-Za-z] are letters, then the language being used must be specified.
-
Catalina Chircu about 3 years
@phuclv: Indeed, I should have mentioned the language, not the encoding. The language is important, and finding the letters in English is not the same as finding the letters in Spanish or French. If you do not take into account the diacritics in these languages, you can cut words in two.
-
Stefan Haustein about 3 years
Doesn't work in Firefox: bugzilla.mozilla.org/show_bug.cgi?id=1361876
-
Toto almost 3 years
You should have a look at an ASCII table. A-z matches more than just letters, as does À-ú.
-
ndrwnaguib over 2 years
Hello @jarraga. Welcome to SO. Did you read how to answer a question? It should help make your answer clearer, and hence avoid downvotes.
-
Toto over 2 years
What about non-Latin letters? For example çéàñ. Your regex is less readable than \p{L}.
-
Frederic about 2 years
Clever answer. Works perfectly for accented letters as well.
-
jave.web almost 2 years
For letters beyond English: /\p{Letter}/gu ref: developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/…
-
jave.web almost 2 years
JavaScript needs the u flag after the regex to detect the Unicode group: /\p{Letter}/gu
-
dimitar.bogdanov over 1 year
^ or any Cyrillic letters
-
Eric Soyke about 1 year
For a long time I had been using [A-z]+ but just noticed this allows a few special characters like ` and [ to slip in. [a-zA-Z]+ is indeed the way to go.
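The range A-z spans the ASCII codes 65 to 122, which includes the six characters [ \ ] ^ _ ` that sit between Z and a. A quick JavaScript check, offered as an illustration only (the variable names are made up):

```javascript
const loose = /^[A-z]+$/;
const strict = /^[a-zA-Z]+$/;

console.log(loose.test("[`_^"));   // true: these sit between Z and a in ASCII
console.log(strict.test("[`_^"));  // false
console.log(strict.test("Hello")); // true

// Collect every non-letter that [A-z] lets through.
const extras = [];
for (let code = 65; code <= 122; code++) {
  const ch = String.fromCharCode(code);
  if (/[A-z]/.test(ch) && !/[a-zA-Z]/.test(ch)) extras.push(ch);
}
console.log(extras.join(" ")); // [ \ ] ^ _ `
```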