Strip Non-Alphanumeric Characters from an NSString

Solution 1

We can do this by splitting the string on the unwanted characters and then joining the pieces back together. This requires OS X 10.5+ for componentsSeparatedByCharactersInSet:.

NSCharacterSet *charactersToRemove = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
NSString *strippedReplacement = [[someString componentsSeparatedByCharactersInSet:charactersToRemove] componentsJoinedByString:@""];

Solution 2

In Swift, componentsJoinedByString is replaced by join(...). The example below replaces each non-alphanumeric character with a space; use "" as the separator instead to remove the characters entirely.

let charactersToRemove = NSCharacterSet.alphanumericCharacterSet().invertedSet
let strippedReplacement = " ".join(someString.componentsSeparatedByCharactersInSet(charactersToRemove))

For Swift 2:

var enteredByUser = field.text ?? ""  // or any other string source

let unsafeChars = NSCharacterSet.alphanumericCharacterSet().invertedSet

enteredByUser = enteredByUser
         .componentsSeparatedByCharactersInSet(unsafeChars)
         .joinWithSeparator("")

If you want to delete just one character, for example all newlines ("\n"):

enteredByUser = enteredByUser
         .componentsSeparatedByString("\n")
         .joinWithSeparator("")
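
For readers on Swift 3 or later, the single-character removal above might be sketched as follows. This is an illustration rather than part of the original answer: the sample string is hypothetical, and replacingOccurrences(of:with:) is swapped in as an alternative to splitting and joining.

```swift
import Foundation

// Hypothetical input; stands in for the text-field contents in the answer.
let enteredByUser = "first line\nsecond line\n"

// Remove every newline by replacing it with an empty string.
let withoutReturns = enteredByUser.replacingOccurrences(of: "\n", with: "")

// Equivalent split-and-join form, mirroring the Swift 2 code above.
let viaSplitJoin = enteredByUser
    .components(separatedBy: "\n")
    .joined(separator: "")
```

Both forms should produce the same string; the replacement form avoids building an intermediate array.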

Solution 3

What I wound up doing was creating an NSCharacterSet and inverting it with the -invertedSet method that I found (it's a wonder what an extra hour of sleep does for documentation-reading abilities). Here's the code snippet, assuming that someString is the string from which you want to remove non-alphanumeric characters:

NSCharacterSet *charactersToRemove =
    [[NSCharacterSet alphanumericCharacterSet] invertedSet];

NSString *trimmedReplacement =
    [someString stringByTrimmingCharactersInSet:charactersToRemove];

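trimmedReplacement will then contain someString with any non-alphanumeric characters removed from its beginning and end. Note that stringByTrimmingCharactersInSet: does not touch characters in the middle of the string; to strip those as well, use the split-and-join approach from Solution 1.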
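
As Ken Aspeslagh's comment points out, trimming only affects the ends of the string. A minimal Swift sketch of the difference between trimming and split-and-join, using a hypothetical sample string:

```swift
import Foundation

// Hypothetical sample input with unwanted characters at the ends and inside.
let someString = "--Hello, World! 123--"
let charactersToRemove = CharacterSet.alphanumerics.inverted

// Trimming removes matching characters from the beginning and end only.
let trimmed = someString.trimmingCharacters(in: charactersToRemove)

// Splitting and joining removes matching characters everywhere.
let stripped = someString
    .components(separatedBy: charactersToRemove)
    .joined(separator: "")
```

Here trimmed keeps the interior comma, space, and exclamation mark, while stripped contains only letters and digits.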

Solution 4

Swift 3 version of accepted answer:

let unsafeChars = CharacterSet.alphanumerics.inverted
let myStrippedString = myString.components(separatedBy: unsafeChars).joined(separator: "")
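
An alternative sketch, not taken from the answer, filters the string's Unicode scalars directly instead of splitting and joining. The function name strippingNonAlphanumerics is hypothetical.

```swift
import Foundation

// Hypothetical helper name; not part of Foundation.
func strippingNonAlphanumerics(_ input: String) -> String {
    let allowed = CharacterSet.alphanumerics
    // Keep only the Unicode scalars that belong to the allowed set.
    let kept = input.unicodeScalars.filter { allowed.contains($0) }
    // Rebuild a String from the surviving scalars.
    return String(String.UnicodeScalarView(kept))
}

let myStrippedString = strippingNonAlphanumerics("a-b_c 1!2")
```

This avoids the intermediate array of components, at the cost of being a little less obvious to read.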

Solution 5

A Cleanup Category

I have two methods, clp_stringByStrippingCharactersInSet: and clp_stringByCollapsingWhitespace, that might be convenient to just drop in.

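#import <Foundation/Foundation.h>

@interface NSString (Cleanup)
- (NSString *)clp_stringByStrippingCharactersInSet:(NSCharacterSet *)set;
- (NSString *)clp_stringByCollapsingWhitespace;
@end

@implementation NSString (Cleanup)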

- (NSString *)clp_stringByStrippingCharactersInSet:(NSCharacterSet *)set
{
    return [[self componentsSeparatedByCharactersInSet:set] componentsJoinedByString:@""];
}

- (NSString *)clp_stringByCollapsingWhitespace
{
    NSArray *components = [self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    components = [components filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"self <> ''"]];

    return [components componentsJoinedByString:@" "];
}

@end
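
A rough Swift counterpart of the category above, written as a String extension. The method names are hypothetical and chosen for illustration; the logic follows the same split, filter, and join steps.

```swift
import Foundation

extension String {
    // Hypothetical Swift counterpart of clp_stringByStrippingCharactersInSet:.
    func strippingCharacters(in set: CharacterSet) -> String {
        return components(separatedBy: set).joined(separator: "")
    }

    // Hypothetical Swift counterpart of clp_stringByCollapsingWhitespace.
    func collapsingWhitespace() -> String {
        return components(separatedBy: .whitespaces)
            .filter { !$0.isEmpty }   // drop the empty pieces between adjacent spaces
            .joined(separator: " ")
    }
}

let collapsed = "  a  b   c ".collapsingWhitespace()
let lettersAndDigits = "a1-b2".strippingCharacters(in: CharacterSet.alphanumerics.inverted)
```

Because empty components are dropped before joining, leading and trailing whitespace disappears as well. As in the Objective-C version, the whitespace set used here does not include newlines.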
Author: Dushi Fdz

Updated on July 18, 2022

Comments

  • Dushi Fdz, almost 2 years ago

    I'm looking for a quick and easy way to strip non-alphanumeric characters from an NSString. Probably something using an NSCharacterSet, but I'm tired and nothing seems to return a string containing only the alphanumeric characters in a string.

  • Ken Aspeslagh, over 14 years ago
    FYI, stringByTrimmingCharactersInSet: only removes characters from the beginning and end of the string. Maybe that's what you wanted.
  • Dushi Fdz, over 14 years ago
    Hmm, good point, Ken. I didn't know that. It still works for my needs, but that's good to know.
  • Erik, over 11 years ago
    What are alphanumeric characters? E.g. would German "Umlaute", like ä, ö or ü be included in the set and hence not be trimmed?
  • Greg Fodor, almost 11 years ago
    To handle accented characters you need to create an NSMutableCharacterSet that is a union of alphanumericCharacterSet and nonBaseCharacterSet, and invert that.
  • SwiftArchitect, almost 9 years ago
    The trimmedReplacement is misleading. In all iOS NSString invocations, trimmed means from start and end. May I suggest occurrencesReplacement or strippedReplacement instead?
  • Klaas, over 8 years ago
    "".join(componentsSeparatedByCharactersInSet(set)) is even better.
  • dy_, over 8 years ago
    @Erik, umlauts would be included. that makes it unusable for filenames :(
  • Erik, over 8 years ago
    @datayeah No worries, just change the first line to invert the 'Portable Filename Character Set' as per pubs.opengroup.org/onlinepubs/9699919799/basedefs/…: NSCharacterSet *charactersToRemove = [[NSCharacterSet characterSetWithCharactersInString:@"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-"] invertedSet];
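
The two character-set variations suggested in the comments might be sketched in Swift like this. The helper name keeping(_:in:) and the sample file name are hypothetical; the sets themselves follow Greg Fodor's and Erik's comments.

```swift
import Foundation

// Letters, digits, and combining marks, following Greg Fodor's comment.
var lettersAndMarks = CharacterSet.alphanumerics
lettersAndMarks.formUnion(.nonBaseCharacters)

// POSIX portable filename characters, following Erik's comment.
let portableFilename = CharacterSet(charactersIn:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-")

// Hypothetical helper: keep only the characters contained in `allowed`.
func keeping(_ allowed: CharacterSet, in input: String) -> String {
    return input.components(separatedBy: allowed.inverted).joined(separator: "")
}

let sample = "M\u{00FC}ller report.txt"   // hypothetical file name containing an umlaut
let asciiOnly = keeping(portableFilename, in: sample)
let unicodeLetters = keeping(lettersAndMarks, in: sample)
```

With the portable filename set the umlaut and the space are dropped but the dot survives; with the Unicode-aware set the umlaut survives but the dot and the space are dropped.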