How to remove the Â character from a string in Java?
Solution 1
You have to use the right encoding (UTF-8):
Code example:
String blub = " ~KARACHI¦~~~~~~";
System.out.println(blub);
System.out.println(blub.replaceAll(new String("Â".getBytes("UTF-8"), "UTF-8"), ""));
Output:
~KARACHI¦~~~~~~
~KARACHI¦~~~~~~
See a description of a similar problem here: Link
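As the comments below point out, a plain literal replacement is enough here. The following is a minimal sketch, not the original answer's code: the class name, the method name and the sample value are made up for illustration, and the two characters are written as Unicode escapes (\u00C2 for Â, \u00A6 for ¦) so that the result does not depend on the encoding of the source file.

```java
class RemoveStrayCharacter {

    // Removes every occurrence of U+00C2 (Â) and leaves all other characters,
    // including U+00A6 (¦), untouched.
    static String removeStray(String input) {
        // String.replace treats its arguments as literal text, not as a regex.
        return input.replace("\u00C2", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample value with a stray Â in front of the ¦.
        String blub = "~IQBAL~KARACHI\u00C2\u00A6~~~~~~";
        System.out.println(removeStray(blub));
    }
}
```

Because replace() does no pattern matching, there is nothing to escape and only the one character is touched.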
Solution 2
I have a simple one-liner that removes every non-ASCII character. I tested it with your character as well, i.e. Â. Be aware that it also removes ¦, because ¦ (U+00A6) is outside the ASCII range.
String myString = "~KARACHI¦~~~~~~";
String result = myString.replaceAll("[^\\x00-\\x7F]","");
System.out.println(result);
You can find complete code here. You may test that as well here.
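To see how the character class behaves on the two characters from the question, here is a small sketch. The class and method names are hypothetical, and the second method is only one possible variation (it adds U+00A6 to the set of characters to keep), not something taken from the answer.

```java
class StripNonAscii {

    // The pattern from the answer: removes every character outside U+0000..U+007F.
    static String stripNonAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]", "");
    }

    // Variation (an assumption about what is wanted): also keep U+00A6 (¦).
    static String stripNonAsciiExceptBrokenBar(String s) {
        return s.replaceAll("[^\\x00-\\x7F\\u00A6]", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample value: \u00C2 is Â, \u00A6 is ¦.
        String myString = "~KARACHI\u00C2\u00A6~~~~~~";
        System.out.println(stripNonAscii(myString));                // Â and ¦ are both removed
        System.out.println(stripNonAsciiExceptBrokenBar(myString)); // only Â is removed
    }
}
```

The variation still strips any other non-ASCII character, so it only fits data where ¦ is the single non-ASCII character worth keeping.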
Author by
Ahmed Junaid
Updated on June 04, 2022
Comments
-
Ahmed Junaid over 1 year
I have the following string, in which the special character Â appears as a hidden character. I want to remove only the Â from this string: ~IQBAL~KARACHI¦~~~~~~~~~~~. Here is a before and after image to show what I mean.
I've tried this code:
responseMessageUTF.replaceAll("\\P{InBasic_Latin}", "");
but this also replaces the ¦ character. Is there any way to remove only the Â character and not the ¦ character?
-
JB Nizet over 8 years
What's the point of new String("Â".getBytes("UTF-8"), "UTF-8")? That's a very convoluted way of writing "Â". Also, replaceAll() expects a regex. replace() is more appropriate.
-
MrT over 8 years
I use this when I have to specify the charset for specific characters I need to parse. It's maybe not the easiest solution, but it works for me in my code.
-
JB Nizet over 8 years
Characters don't have a charset. A charset is used to transform characters to bytes and vice versa. No such transformation is needed here. You just want to remove characters from a String, which is a sequence of characters. blub.replace("Â", "") works just as well.
-
MrT over 8 years
You are right. Your solution works well too and is much easier!
-
Jon Skeet over 8 years
Given the question, I don't think this is the right approach - it does answer the question asked, but doesn't address the underlying problem of why the rogue character is there to start with. (And your conversion of the string into bytes and back again is utterly pointless.) The OP should work out what the data is meant to represent, and find out why it's not being read properly.
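One common way such a rogue character gets in, offered here only as a guess about the OP's data, is that UTF-8 bytes are decoded with a single-byte charset such as ISO-8859-1. The ¦ character (U+00A6) is encoded in UTF-8 as the two bytes C2 A6, and reading those bytes as ISO-8859-1 yields Â followed by ¦. The sketch below reproduces that effect and reverses it; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Simulates the suspected bug: UTF-8 bytes decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage by recovering the raw bytes and decoding them as UTF-8.
    // Assumes every character of the garbled text fits in ISO-8859-1.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample value: \u00A6 is ¦.
        String original = "~IQBAL~KARACHI\u00A6~~~~~~";
        String garbled = misdecode(original);

        // The garbled text now contains Â (U+00C2) directly in front of the ¦.
        System.out.println(garbled.contains("\u00C2\u00A6"));
        // Decoding with the right charset restores the original text.
        System.out.println(repair(garbled).equals(original));
    }
}
```

If this is what happened, the cleaner fix is to read the incoming bytes with the correct charset in the first place, so that no character has to be stripped afterwards.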