Best way to escape characters like newline and double-quote in NSString

27,384

Solution 1

I don't think there is any built-in method to "escape" a particular set of characters.

If the set of characters you wish to escape is well-defined, I'd probably stick with the simple solution you proposed and replace each instance of those characters directly.
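As a rough illustration of that simple approach, here is a minimal sketch. The function name EscapeByReplacement and the exact set of escaped characters are assumptions chosen to match the example in the question, not anything built into Foundation. Note the order: the backslash has to be handled first, or the backslashes added by the later replacements would be escaped a second time.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): backslash-escapes a fixed,
// well-defined set of characters by plain substring replacement.
static NSString *EscapeByReplacement(NSString *input)
{
    NSString *s = input;
    // Backslash first, so the backslashes inserted below are not re-escaped.
    s = [s stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
    s = [s stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
    s = [s stringByReplacingOccurrencesOfString:@"'"  withString:@"\\'"];
    // A real newline becomes the two characters backslash and 'n'.
    s = [s stringByReplacingOccurrencesOfString:@"\n" withString:@"\\n"];
    return s;
}
```

Each call makes a full pass over the string, which is fine for a handful of characters but gets wasteful as the set grows.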

Be warned that if your source string already contains escaped characters, you'll probably want to avoid "double-escaping" them. One way to achieve this is to first "unescape" any escape sequences already in the string, and then escape everything again.

If you need to support a variable set of escaped characters, take a look at the NSScanner methods "scanUpToCharactersFromSet:intoString:" and "scanCharactersFromSet:intoString:". You could use these methods on NSScanner to cruise through a string, copying the parts from the "scanUpTo" section into a mutable string unchanged, and copying the parts from a particular character set only after escaping them.
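The NSScanner approach described above could look roughly like this. It is only a sketch: EscapeCharactersInSet is a hypothetical name, and it simply puts a backslash in front of every character in the set, so a newline would come out as a backslash followed by a real newline rather than as "\n" (that would need a separate lookup from character to replacement).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: copies `input`, prefixing every character that is a
// member of `set` with a backslash.
static NSString *EscapeCharactersInSet(NSString *input, NSCharacterSet *set)
{
    NSMutableString *result = [NSMutableString string];
    NSScanner *scanner = [NSScanner scannerWithString:input];
    // By default NSScanner skips whitespace and newlines; disable that so
    // no characters are silently dropped from the output.
    [scanner setCharactersToBeSkipped:nil];

    while (![scanner isAtEnd]) {
        NSString *chunk = nil;
        // Copy the run of ordinary characters unchanged.
        if ([scanner scanUpToCharactersFromSet:set intoString:&chunk]) {
            [result appendString:chunk];
        }
        // Escape each character in the following run of special characters.
        if ([scanner scanCharactersFromSet:set intoString:&chunk]) {
            for (NSUInteger i = 0; i < [chunk length]; i++) {
                [result appendFormat:@"\\%C", [chunk characterAtIndex:i]];
            }
        }
    }
    return result;
}
```

The set can then be built at run time, for example with [NSCharacterSet characterSetWithCharactersInString:@"\\\"'"] for backslash, double quote, and single quote.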

Solution 2

stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding

Solution 3

I think in cases like these it's useful to operate on one character at a time, either as unichars or as UTF-8 bytes. If you're using UTF-8, then vis(3) will do most of the work for you (see below). Can I ask why you want to escape a single quote within a double-quoted string? How are you planning to handle multi-byte characters? In the example below, I'm using UTF-8 and encoding 8-bit characters as C-style octal escapes. This can also be undone with unvis(3).

#import <Foundation/Foundation.h>
#import <vis.h>

@interface NSString (Escaping)

- (NSString *)stringByEscapingMetacharacters;

@end

@implementation NSString (Escaping)

- (NSString *)stringByEscapingMetacharacters
{
    const char *UTF8Input = [self UTF8String];
    char *UTF8Output = [[NSMutableData dataWithLength:strlen(UTF8Input) * 4 + 1 /* Worst case */] mutableBytes];
    char ch, *och = UTF8Output;

    while ((ch = *UTF8Input++))
        if (ch == '\'' || ch == '\\' || ch == '"')
        {
            *och++ = '\\';
            *och++ = ch;
        }
        else if (isascii(ch))
            och = vis(och, ch, VIS_NL | VIS_TAB | VIS_CSTYLE, *UTF8Input);
        else
            och+= sprintf(och, "\\%03hho", ch);

    return [NSString stringWithUTF8String:UTF8Output];
}

@end

int
main(int argc, const char *argv[])
{
    NSAutoreleasePool *pool = [NSAutoreleasePool new];

    NSLog(@"%@", [@"I said \"Hello, world!\".\nHe said \"My name's not World.\"" stringByEscapingMetacharacters]);

    [pool drain];
    return 0;
}

Solution 4

This is a snippet I have used in the past that works quite well:

- (NSString *)escapeString:(NSString *)aString
{
    NSMutableString *returnString = [[NSMutableString alloc] init];
    NSUInteger length = [aString length];

    for (NSUInteger i = 0; i < length; i++) {

        unichar c = [aString characterAtIndex:i];

        // Prefix backslash, single quote, and double quote with a backslash.
        // %C (not %c) is the format specifier for a unichar, so non-ASCII
        // characters are preserved.
        if (('\\' == c) || ('\'' == c) || ('"' == c)) {
            [returnString appendFormat:@"\\%C", c];
        } else {
            [returnString appendFormat:@"%C", c];
        }
    }

    return [returnString autorelease];
}

Solution 5

Do this:

NSString * encodedString = (NSString *)CFURLCreateStringByAddingPercentEscapes(
    NULL,
    (CFStringRef)unencodedString,
    NULL,
    (CFStringRef)@"!*'();:@&=+$,/?%#[]",
    kCFStringEncodingUTF8 );

Reference: http://simonwoodside.com/weblog/2009/4/22/how_to_really_url_encode/
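CFURLCreateStringByAddingPercentEscapes has since been deprecated (as of OS X 10.11 and iOS 9). A roughly comparable result can be sketched with the NSString method stringByAddingPercentEncodingWithAllowedCharacters:. The helper name PercentEncode and the choice to allow only the RFC 3986 "unreserved" characters are assumptions for illustration. Keep in mind that this still produces percent escapes, not backslash escapes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: percent-encodes everything except the RFC 3986
// "unreserved" characters (letters, digits, '-', '.', '_', '~').
static NSString *PercentEncode(NSString *input)
{
    NSCharacterSet *allowed = [NSCharacterSet characterSetWithCharactersInString:
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        @"abcdefghijklmnopqrstuvwxyz"
        @"0123456789-._~"];
    // Characters outside `allowed` are replaced by %XX escapes of their
    // UTF-8 bytes.
    return [input stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}
```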

Author by

TOMKA

Updated on July 09, 2022

Comments

  • TOMKA, almost 2 years ago

    Say I have an NSString (or NSMutableString) containing:

    I said "Hello, world!".
    He said "My name's not World."
    

    What's the best way to turn that into:

    I said \"Hello, world!\".\nHe said \"My name\'s not World.\"
    

    Do I have to manually use -replaceOccurrencesOfString:withString: over and over to escape characters, or is there an easier way? These strings may contain characters from other alphabets/languages.

    How is this done in other languages with other string classes?

  • TOMKA, about 15 years ago
    It's a lot more complicated than I thought it would ever have to be, but it does the job well.
  • TOMKA, almost 14 years ago
    Those are percent escapes, I want backslash escapes.