How To Remove All Lines Containing Any Non-ASCII Characters Using Notepad++ or EmEditor

[^\x00-\x7F] works fine, but if you want to use a long character class like [^a-z0-9``~!@#$%^&*()-_=+[]{}\|;:'"<>,./?], you have to escape the characters that have a special meaning inside a character class (i.e. - [ ] \) and add the line-break characters \r and \n.

Your regex becomes:

 [^a-z0-9``~!@#$%^&*()\-_=+\[\]{}\\|;:'"<>,./?\r\n]
 #                    ^    ^ ^   ^            ^^^^
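To see why the dash needs escaping, here is a minimal Python sketch. Python's `re` module is not the regex engine Notepad++ uses, so treat it as an illustration of the general character-class rule rather than of Notepad++ itself; the two short classes and the names `unescaped` and `escaped` are made up for the example.

```python
import re

# Unescaped: ")-_" is read as a range from ")" to "_",
# which also covers digits and uppercase letters such as "A".
unescaped = re.compile(r"[()-_]")

# Escaped: the class holds exactly "(", ")", "-" and "_".
escaped = re.compile(r"[()\-_]")

print(bool(unescaped.fullmatch("A")))  # True: "A" falls inside the range
print(bool(escaped.fullmatch("A")))    # False: "A" is not in the class
print(bool(escaped.fullmatch("-")))    # True: the dash is matched literally
```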

  • Ctrl+H
  • Find what: [^a-z0-9``~!@#$%^&*()\-_=+\[\]{}\\|;:'"<>,./?\r\n]+$
    (But, again, [^\x00-\x7F] works fine and is more readable.)
  • Replace with: LEAVE EMPTY
  • check Wrap around
  • check Regular expression
  • Replace all

Result for given example:

0123456789`~!@#$%^&*()-_=+[]{}\|;:'"<>,./?
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
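
For comparison, the same clean-up can be sketched outside the editor. This is a minimal Python example, assuming the goal is to drop every line that contains at least one character outside the ASCII range; the names `NON_ASCII`, `keep_ascii_lines` and `sample` are hypothetical and not part of the original answer.

```python
import re

# Matches any single character outside the ASCII range 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def keep_ascii_lines(text: str) -> str:
    """Drop every line that contains at least one non-ASCII character."""
    kept = [line for line in text.split("\n") if not NON_ASCII.search(line)]
    return "\n".join(kept)


# Two pure-ASCII lines, one non-ASCII line, and one mixed line.
sample = "abc123\n¤©ª«\nABC[]{}\\|\n₠₡₢ mixed ascii\n"
print(keep_ascii_lines(sample))
```

Unlike the Find/Replace steps above, this removes the whole line, including mixed lines that contain both ASCII and non-ASCII characters.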

Author: DeathRival

Updated on September 18, 2022

Comments

  • DeathRival
    DeathRival almost 2 years

    How do I remove all lines containing any non-ASCII keyboard characters?

    I tried many regular expressions, but none worked as expected. I even tried [^\x00-\x7F]+, but it didn't select all the characters.

    My next idea was to use [^a-z0-9``~!@#$%^&*()-_=+[]{}\|;:'"<>,./?], but that still doesn't work, because some of these characters are still selected, such as \ / | { } [ ] $ # ^ ( )

    1. If a line contains any character not in the list below, I want to remove it or bookmark it:

      0123456789`~!@#$%^&*()-_=+[]{}\/|;:'"<>,.?
      abcdefghijklmnopqrstuvwxyz
      ABCDEFGHIJKLMNOPQRSTUVWXYZ
      
    2. Simple example (more characters like these are listed at https://en.wikipedia.org/wiki/List_of_Unicode_characters):

      0123456789`~!@#$%^&*()-_=+[]{}\|;:'"<>,./?
      abcdefghijklmnopqrstuvwxyz
      ABCDEFGHIJKLMNOPQRSTUVWXYZ
      ¤©ª«¬¯°±²³´µ¶·¸¹º»¼½¾¿÷ÆIJŒœƔƕƋƕ
      ƜƝƢƸƾDžNJNjǽǾǼɁɀȾɎʒəɼʰʲʱʴʳʵʶʷʸˁˀˇˆ˟ˠ
      ˩˧Ͱͱͳʹͼͻͺ͵ͿΏΔΘΞΛΣΠΦΧΨΩΪΫάέήίΰαβδε
      θηκλμξπςρφχψωϊϋϏώϑϐϓϒϔϕϖϠϟϞϝϜϡϢ
      ϤϣϧϫϬϮϯϰϱ₠₡₢₣₤₥₦₧₨₩₪₫€₭₮₯₰₱₲
      ₳₴₵₶₷₸₹₺₻₼₽₾₿⅐⅑⅒⅓⅔⅕⅖⅗⅘⅙⅚⅛⅜
      ⅝⅞⅟℠℡™℣ℤ℥Ω℧ℨ℩KÅℬℭ℮ℯ⇀⇁ↀↁↂↃↄ
      ⇔⇕⇖⇗⇘⇙⇚⇛⇜⇝⇞⇟⇠⇡⇢⇣⇤⇥⇦⇧⇨⅀⅁⅂⅃⅄ⅅ
      ⅆⅇⅈⅉ⅊⅋⅌⅍ⅎ⅏ⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽ
      
    3. Expected result:

      0123456789`~!@#$%^&*()-_=+[]{}\|;:'"<>,./?
      abcdefghijklmnopqrstuvwxyz
      ABCDEFGHIJKLMNOPQRSTUVWXYZ
      
    • Toto
      Toto over 6 years
      [^\x00-\x7F]+ works fine for me in Notepad++, it gives the expected result. What is your version of Npp (here, I have 7.5.1)? Did you check Regular expression?
    • Seth
      Seth over 6 years
      Characters that are part of regular expression syntax (like [,],(,),#,^) need to be escaped. In Notepad++ you usually do this by prefixing them with a backslash. So [^a-z0-9``~!@#$%^&*()-_=+[]{}\|;:'"<>,./?] would become [\^a-z0-9``~!@\#\$%^&*\(\)-_=+\[\]{}\|;:'"<>,./?] (at least).
    • Toto
      Toto over 6 years
      @Seth: A caret ^ in the first position of a character class means negation; if you escape it, it just means a literal caret. Also, parentheses, the pipe, and other characters don't need to be escaped, but the dash - must be escaped because it denotes a range of characters.
    • Seth
      Seth over 6 years
      @Toto Good point about the leading caret but you need to escape the others if you want to match them literally. This might be special for Notepad++ but with the above "simple example" it doesn't work if you don't escape them.
  • DeathRival
    DeathRival over 6 years
    I only use Windows 7, and I have no option to use another Windows version, except Windows 2012, which I can use as well.
  • chloesoe
    chloesoe over 6 years
    OK, then unfortunately I can't help you. If you have physical access to your computer and can change the boot order, you could create a USB stick with a live Ubuntu and run those commands there; see this tutorial: tutorials.ubuntu.com/tutorial/…
  • DeathRival
    DeathRival over 6 years
    Toto, thanks so much. You always give good, helpful answers that match what the question asks. By the way, I know [^\x00-\x7F] works fine, but not with every single special character; the first regex you gave helped me keep only what I want. Thanks, very helpful.
  • DeathRival
    DeathRival over 6 years
    Thanks for trying to help, but I said in my question that [^\x00-\x7F] doesn't remove everything I need, because there are unknown special characters that this regex doesn't catch. Anyway, Toto helped me out. Thanks for trying.
  • miroxlav
    miroxlav over 6 years
    @DeathRival – no problem, for me, all the above steps worked 100%, turning sample #2 into #3. Of course, you can use what you did in accepted answer, but this one is faster and more effective. (I bet you did not try the steps above :)
  • Gabriel Devillers
    Gabriel Devillers over 3 years