How to remove the Â character from a string in Java?

11,507

Solution 1

You have to use the right encoding (UTF-8):

Code example:

String blub = " ~KARACHIÂ¦~~~~~~";
System.out.println(blub);
System.out.println(blub.replaceAll(new String("Â".getBytes("UTF-8"), "UTF-8"), ""));

Output:

 ~KARACHIÂ¦~~~~~~
 ~KARACHI¦~~~~~~

See a description of a similar problem here: Link
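
As the comments on this answer note, encoding "Â" to UTF-8 bytes and decoding it with the same charset returns an equal string, so a plain replace() does the same job without regex handling. A minimal sketch, assuming the sample string contains Â (U+00C2) directly before ¦ (U+00A6); the class and method names are made up for illustration, and Unicode escapes are used so the source file's own encoding does not matter:

```java
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Removes every U+00C2 (capital A with circumflex) with a literal, non-regex replace.
    static String removeACircumflex(String input) {
        return input.replace("\u00C2", "");
    }

    public static void main(String[] args) {
        // Assumed sample: tilde, KARACHI, U+00C2, U+00A6, six tildes.
        String blub = "~KARACHI\u00C2\u00A6~~~~~~";

        // Encoding and decoding with the same charset gives back an equal string.
        String roundTripped = new String(
                "\u00C2".getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println(roundTripped.equals("\u00C2")); // prints true

        // Only U+00C2 is removed; the broken bar U+00A6 is kept.
        String cleaned = removeACircumflex(blub);
        System.out.println(cleaned.equals("~KARACHI\u00A6~~~~~~")); // prints true
    }
}
```

Using StandardCharsets.UTF_8 instead of the "UTF-8" string name also avoids the checked UnsupportedEncodingException.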

Solution 2

Here is a simple one-liner that removes all non-ASCII characters. I tested it with your character, Â, as well.

        String myString = "~KARACHIÂ¦~~~~~~";
        String result = myString.replaceAll("[^\\x00-\\x7F]","");
        System.out.println(result);

You can find complete code here. You may test that as well here.
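
One thing to keep in mind with this regex: ¦ (U+00A6) is also outside the range \x00-\x7F, so it is stripped together with Â (U+00C2). The sketch below contrasts the two behaviours, assuming the sample string contains both characters; the class and method names are hypothetical.

```java
class AsciiFilterDemo {
    // Drops every character above U+007F, as the one-liner in this answer does.
    static String keepAsciiOnly(String s) {
        return s.replaceAll("[^\\x00-\\x7F]", "");
    }

    // Drops only U+00C2 and leaves every other character alone.
    static String removeOnlyACircumflex(String s) {
        return s.replace("\u00C2", "");
    }

    public static void main(String[] args) {
        // Assumed sample: tilde, KARACHI, U+00C2, U+00A6, six tildes.
        String myString = "~KARACHI\u00C2\u00A6~~~~~~";

        // The ASCII filter removes both non-ASCII characters.
        System.out.println(keepAsciiOnly(myString).equals("~KARACHI~~~~~~")); // prints true

        // The targeted replace keeps the broken bar U+00A6.
        System.out.println(removeOnlyACircumflex(myString).equals("~KARACHI\u00A6~~~~~~")); // prints true
    }
}
```

If the ¦ has to survive, as the question asks, the targeted replace is the safer of the two.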

Author: Ahmed Junaid

Updated on June 04, 2022

Comments

  • Ahmed Junaid, over 1 year ago

    I have the following string, in which the special character Â is hidden. I want to remove only the Â from this string: ~IQBAL~KARACHIÂ¦~~~~~~~~~~~.

    Here is a before and after image to show what I mean: [image]

    I've tried this code:

    responseMessageUTF.replaceAll("\\P{InBasic_Latin}", "");
    

    but this is also replacing the ¦ character. Is there any way to remove only the Â character and not the ¦ character?

  • JB Nizet, over 8 years ago
    What's the point of new String("Â".getBytes("UTF-8"), "UTF-8")? That's a very convoluted way of writing "Â". Also, replaceAll() expects a regex. replace() is more appropriate.
  • MrT, over 8 years ago
    I use this when I specify the charset for specific characters I have to parse. It's maybe not the easiest solution, but it works for me in my code.
  • JB Nizet, over 8 years ago
    Characters don't have a charset. A charset is used to transform characters to bytes and vice versa. No such transformation is needed here. You just want to remove characters from a String, which is a sequence of characters. blub.replace("Â", "") works just as well.
  • MrT, over 8 years ago
    You are right. Your solution works well too and is much easier!
  • Jon Skeet, over 8 years ago
    Given the question, I don't think this is the right approach - it does answer the question asked, but doesn't address the underlying problem of why the rogue character is there to start with. (And your conversion of the string into bytes and back again is utterly pointless.) The OP should work out what the data is meant to represent, and find out why it's not being read properly.
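
The last comment does not say why the rogue character appears. One common cause, assumed here rather than confirmed by the thread, is that UTF-8 bytes were decoded as ISO-8859-1: ¦ (U+00A6) is the two bytes C2 A6 in UTF-8, and reading those bytes as ISO-8859-1 yields Â followed by ¦. The sketch below reproduces that effect and reverses it; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Simulates the assumed bug: text written as UTF-8 but read as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the bug by recovering the raw bytes and decoding them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Intended text: tilde, KARACHI, U+00A6 (broken bar), six tildes.
        String intended = "~KARACHI\u00A6~~~~~~";

        // Misdecoding inserts U+00C2 in front of the broken bar.
        String garbled = misdecode(intended);
        System.out.println(garbled.equals("~KARACHI\u00C2\u00A6~~~~~~")); // prints true

        // Decoding with the right charset restores the intended text.
        System.out.println(repair(garbled).equals(intended)); // prints true
    }
}
```

If this is what is happening, the cleaner fix is to read the data with the correct charset at the source instead of removing characters afterwards.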