Java: remove all non-alphanumeric characters from the beginning and end of a string
Solution 1
Use the ^ (matches at the beginning of the string) and $ (matches at the end) anchors:
s = s.replaceAll("^[^a-zA-Z0-9\\s]+|[^a-zA-Z0-9\\s]+$", "");
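As a rough illustration of how the anchored pattern behaves, the sketch below applies it to a few inputs and contrasts it with a variant that drops the `+` quantifier. The class and method names (`EdgeStripDemo`, `stripEdgeRuns`, `stripOneAtEachEdge`) are made up for this example and are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are illustrative only.
final class EdgeStripDemo {
    // Same pattern as in the answer, compiled once so it can be reused.
    private static final Pattern EDGE_RUNS =
            Pattern.compile("^[^a-zA-Z0-9\\s]+|[^a-zA-Z0-9\\s]+$");

    // Variant without '+': matches at most one character at each end.
    private static final Pattern EDGE_SINGLE =
            Pattern.compile("^[^a-zA-Z0-9\\s]|[^a-zA-Z0-9\\s]$");

    // Removes the whole run of unwanted characters at each end.
    static String stripEdgeRuns(String s) {
        return EDGE_RUNS.matcher(s).replaceAll("");
    }

    // Removes only a single unwanted character at each end.
    static String stripOneAtEachEdge(String s) {
        return EDGE_SINGLE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripEdgeRuns("theString,"));          // theString
        System.out.println(stripEdgeRuns("--the,String!!"));      // the,String
        System.out.println(stripOneAtEachEdge("--the,String!!")); // -the,String!
    }
}
```

Punctuation in the middle of the string is left alone because neither alternative can match there: the first is tied to the start by `^` and the second to the end by `$`.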
Solution 2
Use:
s = s.replaceAll("^[^\\p{L}\\p{N}\\s%]+|[^\\p{L}\\p{N}\\s%]+$", "");
Instead of:
s = s.replaceAll("^[^a-zA-Z0-9\\s]+|[^a-zA-Z0-9\\s]+$", "");
Where \p{L} matches any kind of letter from any language, and \p{N} matches any kind of numeric character in any script. The % in the set means percent signs are kept too.
This matters for Latin-based scripts whenever non-English text is involved. In Spanish, for instance, éstas and apuntó become stas and apunt with the ASCII-only pattern, while the Unicode pattern leaves them intact. The Unicode pattern also works on non-Latin scripts.
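The difference between the two patterns can be sketched as follows. This is a minimal example, assuming a simplified Unicode pattern that keeps only letters, numbers and whitespace; the class and method names are hypothetical. The accented letters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
// Hypothetical demo class; names are illustrative, not from the original answer.
final class UnicodeEdgeDemo {
    // ASCII-only set: accented letters count as "non-alphanumeric".
    static String stripAscii(String s) {
        return s.replaceAll("^[^a-zA-Z0-9\\s]+|[^a-zA-Z0-9\\s]+$", "");
    }

    // Unicode-aware set: any letter (\p{L}) or number (\p{N}) is kept.
    static String stripUnicode(String s) {
        return s.replaceAll("^[^\\p{L}\\p{N}\\s]+|[^\\p{L}\\p{N}\\s]+$", "");
    }

    public static void main(String[] args) {
        String estas = "\u00E9stas,";   // "éstas," with a precomposed e-acute
        String apunto = "apunt\u00F3;"; // "apuntó;" with a precomposed o-acute

        System.out.println(stripAscii(estas));    // stas
        System.out.println(stripUnicode(estas));  // éstas
        System.out.println(stripAscii(apunto));   // apunt
        System.out.println(stripUnicode(apunto)); // apuntó
    }
}
```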
For scripts that write vowels as nonspacing marks, such as Arabic and Hebrew, add \p{Mn}:
s = s.replaceAll("^[^\\p{L}\\p{N}\\p{Mn}\\s%]+|[^\\p{L}\\p{N}\\p{Mn}\\s%]+$", "");
For Dravidian languages, the vowel signs may sit beside or around the consonant, like ಾ, rather than above or below it as in the Semitic scripts. These are spacing combining marks, so use \p{Mc} for them. To cover all languages, use \p{M}, which matches every kind of mark:
s = s.replaceAll("^[^\\p{L}\\p{N}\\p{M}\\s%]+|[^\\p{L}\\p{N}\\p{M}\\s%]+$", "");
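To see why the mark categories matter, the following sketch runs three variants of the pattern over text that ends in a combining mark. It assumes simplified sets without the %, and the class, field and method names are invented for the example. The Spanish sample uses a decomposed accent (o followed by U+0301, category Mn); the Kannada sample ends in the vowel sign U+0CBE (category Mc).

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not from the original answer.
final class CombiningMarkDemo {
    // Keeps letters, numbers and whitespace only.
    static final Pattern NO_MARKS =
            Pattern.compile("^[^\\p{L}\\p{N}\\s]+|[^\\p{L}\\p{N}\\s]+$");
    // Additionally keeps nonspacing marks (category Mn).
    static final Pattern WITH_MN =
            Pattern.compile("^[^\\p{L}\\p{N}\\p{Mn}\\s]+|[^\\p{L}\\p{N}\\p{Mn}\\s]+$");
    // Additionally keeps every mark category (Mn, Mc, Me).
    static final Pattern WITH_ALL_MARKS =
            Pattern.compile("^[^\\p{L}\\p{N}\\p{M}\\s]+|[^\\p{L}\\p{N}\\p{M}\\s]+$");

    static String strip(Pattern edges, String s) {
        return edges.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String spanish = "apunto\u0301;"; // 'o' + combining acute accent, then ';'
        String kannada = "\u0C95\u0CBE."; // letter KA + vowel sign AA, then '.'

        // Without marks in the set, the trailing accent is stripped with the ';'.
        System.out.println(strip(NO_MARKS, spanish).equals("apunto"));              // true
        // Mn keeps the accent but not the Kannada spacing vowel sign.
        System.out.println(strip(WITH_MN, spanish).equals("apunto\u0301"));         // true
        System.out.println(strip(WITH_MN, kannada).equals("\u0C95"));               // true
        // \p{M} keeps both kinds of mark.
        System.out.println(strip(WITH_ALL_MARKS, kannada).equals("\u0C95\u0CBE"));  // true
    }
}
```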
See regex tutorial for a list of Unicode categories
Mike6679
Updated on June 08, 2022
Comments
-
Mike6679 almost 2 years
I know how to replace ALL non-alphanumeric chars in a string, but how do I do it for just the beginning and end of the string?
I need this string:
"theString,"
to be:
theString
This replaces ALL non-alphanumeric chars in a string:
s = s.replaceAll("[^a-zA-Z0-9\\s]", "");
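Because `\\s` sits inside the negated set, whitespace counts as a character to keep. The sketch below, with hypothetical class and method names, shows what that means for the global replacement and for the anchored version: when the string starts or ends with whitespace, the anchored pattern cannot reach punctuation that lies behind it.

```java
// Hypothetical demo class; names are illustrative only.
final class WhitespaceSetDemo {
    // The pattern from the question: removes everything except a-z, A-Z, 0-9 and whitespace.
    static String removeKeepingWhitespace(String s) {
        return s.replaceAll("[^a-zA-Z0-9\\s]", "");
    }

    // Without \\s in the set, whitespace is removed as well.
    static String removeAllNonAlnum(String s) {
        return s.replaceAll("[^a-zA-Z0-9]", "");
    }

    // Anchored version that still treats whitespace as a character to keep.
    static String stripEdgesKeepingWhitespace(String s) {
        return s.replaceAll("^[^a-zA-Z0-9\\s]+|[^a-zA-Z0-9\\s]+$", "");
    }

    public static void main(String[] args) {
        System.out.println(removeKeepingWhitespace("the String, again!")); // the String again
        System.out.println(removeAllNonAlnum("the String, again!"));       // theStringagain
        // The leading and trailing spaces block the anchored match, so nothing changes.
        System.out.println("[" + stripEdgesKeepingWhitespace(" ,theString, ") + "]"); // [ ,theString, ]
    }
}
```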
-
David Conrad almost 10 years
What's that \\s doing in there? I know OP had it, but it was wrong then and it's wrong now.
-
falsetru almost 10 years
@DavidConrad, \\s will match any whitespace character. I thought it was OP's intention to exclude alpha-numeric characters and space characters, so I didn't touch it.
-
David Conrad almost 10 years
Exactly, that's why it's wrong. OP said "replace ALL non alphanumeric chars in a string". It's a negated set, so it will replace anything EXCEPT a-z, A-Z, 0-9, and any whitespace character. So it will leave in whitespace.
-
David Conrad almost 10 years
I think OP was trying to match UP TO a space, and didn't get how sets work. I guess I could be wrong.
-
Mike6679 almost 10 years
@falsetru does this strip ALL non-alphanumeric chars from the beginning and end of the string, or just one at the beginning and one at the end?
-
falsetru almost 10 years
@Mike, It removes all non-alphanumeric, non-whitespace characters from the beginning and the end of the string (I used +). If you want to remove only one, remove the +.
-
Mike6679 almost 10 years
Just a note: Although this did work, I had to replace it with my own parser because the regex was just too expensive over thousands of iterations.
-
O. Jones over 9 years
This removes all the non-alphanumeric characters
-
borchvm over 4 years
Code-only answers are considered low quality: make sure to provide an explanation of what your code does and how it solves the problem. It will help the asker and future readers both if you can add more information in your post. See also Explaining entirely code-based answers: meta.stackexchange.com/questions/114762/…
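The hand-written parser mentioned in the comments above was never posted, so the following is only a guess at what a regex-free version might look like: a two-index scan that moves inward from each end until it meets an ASCII letter or digit. The class and method names are hypothetical. Unlike the regex answers, this sketch also drops leading and trailing whitespace, which matches the question's title rather than its `\\s` pattern.

```java
// Hypothetical regex-free alternative; not the parser mentioned in the comments.
final class ManualEdgeTrim {
    // Assumption: "alphanumeric" means ASCII letters and digits, as in the question.
    private static boolean isAsciiAlnum(char c) {
        return (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9');
    }

    static String trimNonAlnumEdges(String s) {
        int start = 0;
        int end = s.length();
        // Advance past unwanted characters at the front.
        while (start < end && !isAsciiAlnum(s.charAt(start))) {
            start++;
        }
        // Retreat past unwanted characters at the back.
        while (end > start && !isAsciiAlnum(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trimNonAlnumEdges("theString,"));     // theString
        System.out.println(trimNonAlnumEdges("--the,String!!")); // the,String
        System.out.println(trimNonAlnumEdges(" ,theString, "));  // theString
    }
}
```

If the regex is kept instead, compiling it once with `Pattern.compile` and reusing the `Pattern` avoids recompiling on every call, which `String.replaceAll` otherwise does. Whether either change matters in practice would need to be measured on the real workload.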