Find out if Character in String is emoji?

73,443

Solution 1

What I stumbled upon is the difference between characters, unicode scalars and glyphs.

For example, the glyph ๐Ÿ‘จโ€๐Ÿ‘จโ€๐Ÿ‘งโ€๐Ÿ‘ง consists of 7 unicode scalars:

  • Four emoji characters: ๐Ÿ‘จ๐Ÿ‘ฉ๐Ÿ‘ง๐Ÿ‘ง
  • In between each emoji is a special character, which works like character glue; see the specs for more info

Another example, the glyph ๐Ÿ‘Œ๐Ÿฟ consists of 2 unicode scalars:

  • The regular emoji: ๐Ÿ‘Œ
  • A skin tone modifier: ๐Ÿฟ

Last one, the glyph 1๏ธโƒฃ contains three unicode characters:

So when rendering the characters, the resulting glyphs really matter.

Swift 5.0 and above makes this process much easier and gets rid of some guesswork we needed to do. Unicode.Scalar's new Property type helps is determine what we're dealing with. However, those properties only make sense when checking the other scalars within the glyph. This is why we'll be adding some convenience methods to the Character class to help us out.

For more detail, I wrote an article explaining how this works.

For Swift 5.0, this leaves you with the following result:

extension Character {
    /// A simple emoji is one scalar and presented to the user as an Emoji
    var isSimpleEmoji: Bool {
        guard let firstScalar = unicodeScalars.first else { return false }
        return firstScalar.properties.isEmoji && firstScalar.value > 0x238C
    }

    /// Checks if the scalars will be merged into an emoji
    var isCombinedIntoEmoji: Bool { unicodeScalars.count > 1 && unicodeScalars.first?.properties.isEmoji ?? false }

    var isEmoji: Bool { isSimpleEmoji || isCombinedIntoEmoji }
}

extension String {
    var isSingleEmoji: Bool { count == 1 && containsEmoji }

    var containsEmoji: Bool { contains { $0.isEmoji } }

    var containsOnlyEmoji: Bool { !isEmpty && !contains { !$0.isEmoji } }

    var emojiString: String { emojis.map { String($0) }.reduce("", +) }

    var emojis: [Character] { filter { $0.isEmoji } }

    var emojiScalars: [UnicodeScalar] { filter { $0.isEmoji }.flatMap { $0.unicodeScalars } }
}

Which will give you the following results:

"Aฬ›อšฬ–".containsEmoji // false
"3".containsEmoji // false
"Aฬ›อšฬ–โ–ถ๏ธ".unicodeScalars // [65, 795, 858, 790, 9654, 65039]
"Aฬ›อšฬ–โ–ถ๏ธ".emojiScalars // [9654, 65039]
"3๏ธโƒฃ".isSingleEmoji // true
"3๏ธโƒฃ".emojiScalars // [51, 65039, 8419]
"๐Ÿ‘Œ๐Ÿฟ".isSingleEmoji // true
"๐Ÿ™Ž๐Ÿผโ€โ™‚๏ธ".isSingleEmoji // true
"๐Ÿ‡น๐Ÿ‡ฉ".isSingleEmoji // true
"โฐ".isSingleEmoji // true
"๐ŸŒถ".isSingleEmoji // true
"๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".isSingleEmoji // true
"๐Ÿด๓ ง๓ ข๓ ณ๓ ฃ๓ ด๓ ฟ".isSingleEmoji // true
"๐Ÿด๓ ง๓ ข๓ ฅ๓ ฎ๓ ง๓ ฟ".containsOnlyEmoji // true
"๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".containsOnlyEmoji // true
"Hello ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".containsOnlyEmoji // false
"Hello ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".containsEmoji // true
"๐Ÿ‘ซ Hรฉllo ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".emojiString // "๐Ÿ‘ซ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง"
"๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".count // 1

"๐Ÿ‘ซ Hรฉllล“ ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".emojiScalars // [128107, 128104, 8205, 128105, 8205, 128103, 8205, 128103]
"๐Ÿ‘ซ Hรฉllล“ ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".emojis // ["๐Ÿ‘ซ", "๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง"]
"๐Ÿ‘ซ Hรฉllล“ ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง".emojis.count // 2

"๐Ÿ‘ซ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง๐Ÿ‘จโ€๐Ÿ‘จโ€๐Ÿ‘ฆ".isSingleEmoji // false
"๐Ÿ‘ซ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง๐Ÿ‘จโ€๐Ÿ‘จโ€๐Ÿ‘ฆ".containsOnlyEmoji // true

For older Swift versions, check out this gist containing my old code.

Solution 2

The simplest, cleanest, and swiftiest way to accomplish this is to simply check the Unicode code points for each character in the string against known emoji and dingbats ranges, like so:

extension String {

    var containsEmoji: Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x1F600...0x1F64F, // Emoticons
                 0x1F300...0x1F5FF, // Misc Symbols and Pictographs
                 0x1F680...0x1F6FF, // Transport and Map
                 0x2600...0x26FF,   // Misc symbols
                 0x2700...0x27BF,   // Dingbats
                 0xFE00...0xFE0F,   // Variation Selectors
                 0x1F900...0x1F9FF, // Supplemental Symbols and Pictographs
                 0x1F1E6...0x1F1FF: // Flags
                return true
            default:
                continue
            }
        }
        return false
    }

}

Solution 3

Swift 5.0

โ€ฆ introduced a new way of checking exactly this!

You have to break your String into its Scalars. Each Scalar has a Property value which supports the isEmoji value!

Actually you can even check if the Scalar is a Emoji modifier or more. Check out Apple's documentation: https://developer.apple.com/documentation/swift/unicode/scalar/properties

You may want to consider checking for isEmojiPresentation instead of isEmoji, because Apple states the following for isEmoji:

This property is true for scalars that are rendered as emoji by default and also for scalars that have a non-default emoji rendering when followed by U+FE0F VARIATION SELECTOR-16. This includes some scalars that are not typically considered to be emoji.


This way actually splits up Emoji's into all the modifiers, but it is way simpler to handle. And as Swift now counts Emoji's with modifiers (e.g.: ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ฆ, ๐Ÿ‘จ๐Ÿปโ€๐Ÿ’ป, ๐Ÿด) as 1 you can do all kind of stuff.

var string = "๐Ÿค“ test"

for scalar in string.unicodeScalars {
    let isEmoji = scalar.properties.isEmoji

    print("\(scalar.description) \(isEmoji)"))
}

// ๐Ÿค“ true
//   false
// t false
// e false
// s false
// t false

NSHipster points out an interesting way to get all Emoji's:

import Foundation

var emoji = CharacterSet()

for codePoint in 0x0000...0x1F0000 {
    guard let scalarValue = Unicode.Scalar(codePoint) else {
        continue
    }

    // Implemented in Swift 5 (SE-0221)
    // https://github.com/apple/swift-evolution/blob/master/proposals/0221-character-properties.md
    if scalarValue.properties.isEmoji {
        emoji.insert(scalarValue)
    }
}

Solution 4

With Swift 5 you can now inspect the unicode properties of each character in your string. This gives us the convenient isEmoji variable on each letter. The problem is isEmoji will return true for any character that can be converted into a 2-byte emoji, such as 0-9.

We can look at the variable isEmoji and also check the for the presence of an emoji modifier to determine if the ambiguous characters will display as an emoji.

This solution should be much more future proof than the regex solutions offered here.

extension String {
    func containsOnlyEmojis() -> Bool {
        if count == 0 {
            return false
        }
        for character in self {
            if !character.isEmoji {
                return false
            }
        }
        return true
    }
    
    func containsEmoji() -> Bool {
        for character in self {
            if character.isEmoji {
                return true
            }
        }
        return false
    }
}

extension Character {
    // An emoji can either be a 2 byte unicode character or a normal UTF8 character with an emoji modifier
    // appended as is the case with 3๏ธโƒฃ. 0x238C is the first instance of UTF16 emoji that requires no modifier.
    // `isEmoji` will evaluate to true for any character that can be turned into an emoji by adding a modifier
    // such as the digit "3". To avoid this we confirm that any character below 0x238C has an emoji modifier attached
    var isEmoji: Bool {
        guard let scalar = unicodeScalars.first else { return false }
        return scalar.properties.isEmoji && (scalar.value > 0x238C || unicodeScalars.count > 1)
    }
}

Giving us

"hey".containsEmoji() //false

"Hello World ๐Ÿ˜Ž".containsEmoji() //true
"Hello World ๐Ÿ˜Ž".containsOnlyEmojis() //false

"3".containsEmoji() //false
"3๏ธโƒฃ".containsEmoji() //true

Solution 5

extension String {
    func containsEmoji() -> Bool {
        for scalar in unicodeScalars {
            switch scalar.value {
            case 0x3030, 0x00AE, 0x00A9,// Special Characters
            0x1D000...0x1F77F,          // Emoticons
            0x2100...0x27BF,            // Misc symbols and Dingbats
            0xFE00...0xFE0F,            // Variation Selectors
            0x1F900...0x1F9FF:          // Supplemental Symbols and Pictographs
                return true
            default:
                continue
            }
        }
        return false
    }
}

This is my fix, with updated ranges.

Share:
73,443

Related videos on Youtube

Andrew
Author by

Andrew

Updated on December 16, 2021

Comments

  • Andrew
    Andrew over 2 years

    I need to find out whether a character in a string is an emoji.

    For example, I have this character:

    let string = "๐Ÿ˜€"
    let character = Array(string)[0]
    

    I need to find out if that character is an emoji.

    • Martin R
      Martin R almost 9 years
      I am curious: why do you need that information?
    • Martin R
      Martin R almost 9 years
      @EricD.: There are many Unicode characters which take more than one UTF-8 code point (e.g. "โ‚ฌ" = E2 82 AC) or more than one UTF-16 code point (e.g. "๐„ž" =D834 DD1E).
    • Ashish Kakkad
      Ashish Kakkad almost 9 years
      Hope you will got an idea from this obj-c version of code stackoverflow.com/questions/19886642/โ€ฆ
    • Paul B
      Paul B over 4 years
      Strings have their indexing which is a preferred way of using them. To get a particular character (or grapheme cluster rather) you could: let character = string[string.index(after: string.startIndex)] or let secondCharacter = string[string.index(string.startIndex, offsetBy: 1)]
  • thefaj
    thefaj about 8 years
    A code example like this is way better than suggesting to include a third party library dependency. Shardul's answer is unwise advice to followโ€”always write your own code.
  • Shawn Throop
    Shawn Throop about 8 years
    This is great, thank you for commenting what the cases pertain to
  • Cue
    Cue almost 8 years
    Like so much your code, I implemented it in an answer here. A thing I noticed is that it miss some emoji, maybe because they are not part of the categories you listed, for example this one: Robot Face emoji ๐Ÿค–
  • Frizlab
    Frizlab almost 8 years
    @Tel I guess it would be the range 0x1F900...0x1F9FF (per Wikipedia). Not sure all of the range should be considered emoji.
  • Admin
    Admin over 7 years
    This is what i am looking for, Thanks JAL
  • Tim Bull
    Tim Bull over 7 years
    This is by far the best and most correct answer here. Thank you! One small note, your examples don't match the code (you renamed containsOnlyEmoki to containsEmoji in the snippet - I presume because it's more correct, in my testing it returned true for strings with mixed characters).
  • Kevin R
    Kevin R over 7 years
    Thanks for pointing that out. I forgot to add some code to the example. I added the containsOnlyEmoji function. This one does check if the string only consists of emoji's or zero width joiner.
  • Andrew
    Andrew over 7 years
    I'm getting an error on count under containsOnlhEmoji. Not sure what that value was supposed to be?
  • Kevin R
    Kevin R over 7 years
    My bad, I changed around some code, guess I messed up. I updated the example
  • Andrew
    Andrew over 7 years
    @KevinR Awesome. I've changed this to the best answer. Would there be a way to get an array of emoji strings in a string/emoji-only string?
  • Kevin R
    Kevin R over 7 years
    @Andrew: Sure, I added another method to the example to demonstrate this :).
  • Andrew
    Andrew over 7 years
    @KevinR Thanks. I think the result I'd be after would be that a string such as "๐Ÿ‘ซ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง๐Ÿ˜ฏ" would be separated into separate glyphs, resulting in ["๐Ÿ‘ซ", "๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ง", "๐Ÿ˜ฏ"]. Do you know if that's possible?
  • Kevin R
    Kevin R over 7 years
    @Andrew this is where it gets really messy. I added an example how to do that. The problem is I have assume to know how CoreText will render the the glyphs by simply checking the characters. If anyone has suggestions for a cleaner method please let me know.
  • Andrew
    Andrew over 7 years
    @KevinR This is great. 1 issue i've noticed though, is that containsOnlyEmoji doesn't seem to work with some emoji, for example the one called 'smiling face' - โ˜บ๏ธ.
  • Kevin R
    Kevin R over 7 years
    @Andrew Thanks for pointing that out, I changed the way containsOnlyEmoji checks. I also updated the example to Swift 3.0.
  • Andrew
    Andrew over 7 years
    @KevinR So now that works, but as a result of calling emojis I now get ["โ˜บ๏ธ", ""], ie a blank second item.
  • Andrew
    Andrew over 7 years
    @KevinR I've noticed a number of problems here, trying to come up with a solution for this. On calling emojis with the smiling face emoji, it returns an array containing that emoji, plus a seemingly empty space. Comparing a string with the regular smiling face emoji to the first item in that given result returns false, on comparison. The empty-looking string has a character count of 1. And trying to fetch the range of the smiling face emoji from the emojis result, in the original given string, returns nil.
  • Andrey Chernukha
    Andrey Chernukha over 7 years
    Will this be rejected by Apple?
  • JAL
    JAL over 7 years
    @AndreyChernukha There's always a risk, but I haven't experienced any rejection yet.
  • netdigger
    netdigger about 7 years
    Where does the enum values come from?
  • QuangDT
    QuangDT about 7 years
    This code does not work for newer emojis and those with diversity options. If you are using Objective C, you can useenumerateSubstringsInRange passing in NSStringEnumerationByComposedCharacterSequences as the enumeration option.
  • Kevin R
    Kevin R about 7 years
    Hi @RunLoop; could you propose an edit to the answer to clarify that? I think a lot of people would benefit from that :).
  • QuangDT
    QuangDT about 7 years
    Hi @KevinR Thanks for your original, excellent answer - I still use it to detect whether the string does in fact contain an emoji before enumerating the substrings. You are more than welcome to incorporate my addition to your answer. :)
  • xaphod
    xaphod almost 7 years
    Never ever use private APIs. At best, the hurt will only come tomorrow. Or next month.
  • xaphod
    xaphod almost 7 years
    I had problems trying to filter out emoji from a string. Here's using Kevin's answer - thanks @KevinR for your answer. extension String { func stripEmoji() -> String { return self.unicodeScalars.filter({ $0.isEmoji == false }).map({ String($0) }).reduce("", +) } }
  • vikzilla
    vikzilla almost 7 years
    I notice containsOnlyEmoji() doesn't work in detecting the number emoji's 0๏ธโƒฃ through 9๏ธโƒฃ. Any ideas for a fix?
  • justColbs
    justColbs almost 7 years
    As @RunLoop stated, this does not work for newer emojis and those with diversity options. Has anyone found a Swift solution for this? I'm stumped.
  • Kevin R
    Kevin R almost 7 years
    @vikzilla This is because 0๏ธโƒฃ is really 3 characters, a 'normal' 0, a 'variation selector' (see: unicode-table.com/en/search/?q=65039) and a bounding box (see: unicode-table.com/en/search/?q=8419) since the first character is not a emoji. We'd have to use the superseding character to determine it's characteristics. I'm trying to find some time to add this, but feel free to suggest an edit:).
  • justColbs
    justColbs almost 7 years
    @KevinR It still returns the male version for some variation emojis. For example ["๐Ÿ•ต", "๏ธโ€โ™€", "๏ธ"] is returned when calling .emojis on "๐Ÿ•ต๏ธโ€โ™€๏ธ"
  • Kevin R
    Kevin R almost 7 years
    @justColbs The mentioned icon doesn't work on my iOS device, so I guess it's rather new? If you run this on playgrounds: let s = "๐Ÿ•ต๏ธโ€โ™€๏ธ".unicodeScalars.map({$0}) and inspect s, you see the involved characters and get a sense of the problem. You can copy each value of the array onto unicode-table.com search bar and see what it means. This one seems quite complicated ;).
  • justColbs
    justColbs almost 7 years
    @KevinR Yeah I performed the same test and noticed that. It's mostly the complex variation emojis that it has trouble with.
  • sudo
    sudo over 6 years
    I think these ranges are subject to change as the Unicode standard changes. Or at least I haven't seen anything suggesting they stay constant.
  • Kevin R
    Kevin R over 6 years
    @sudo certainly, as the spec expands, more ranges are added to these lists, thats one of the reasons these answers have different lists. If you catch any missing ranges, feel free to contribute :)
  • Warpzit
    Warpzit over 6 years
    Is it just me or has a lot of above been broken with swift 4? characters.count only gives 1 now with swift 4.
  • skyylex
    skyylex over 6 years
    There are still few more emojis which aren't recognized by the extension: ๐Ÿ†Ž ๐Ÿ†” ๐Ÿ†– ๐Ÿ†— โฐ ๐Ÿˆฒ ๐Ÿˆณ ๐Ÿˆด ๐Ÿˆถ ๐Ÿˆน P.S. I've checked on the "/System/Library/PrivateFrameworks/CoreEmoji.framework/Resou‌​rces/en.lproj/FindRe‌​place.strings" which seems to contain text description for most of emoji (all of them?)
  • Anton Shkurenko
    Anton Shkurenko about 6 years
    Broken in Swift 4
  • Kevin R
    Kevin R almost 6 years
    @AntonShkurenko seems to work fine for me, what seems to be the problem?
  • Ramon
    Ramon almost 6 years
    I like your thinking! ;) - Out of the box!
  • Sรธren Pedersen
    Sรธren Pedersen about 5 years
    Just a small note, this implementation is very slow at compiling. Checking with -Xfrontend -debug-time-function-bodies it's around 1800ms (so nearly two seconds)
  • d4Rk
    d4Rk about 5 years
    Why are you doing this to us? #apple #unicodestandard ๐Ÿ˜ฑ๐Ÿค”๐Ÿคช๐Ÿ™ˆ๐Ÿ˜ˆ๐Ÿค•๐Ÿ’ฉ
  • Albert Renshaw
    Albert Renshaw about 5 years
    I haven't looked at this in a while but I wonder if I have to convert to UIColor then to hsb; it seems I can just check that r,g,b all == 0? If someone tries let me know
  • Juan Carlos Ospina Gonzalez
    Juan Carlos Ospina Gonzalez about 5 years
    i like this solution, but won't it break with a character like โ„น ?
  • Albert Renshaw
    Albert Renshaw about 5 years
    @JuanCarlosOspinaGonzalez Nope, in emoji that renders as a blue box with a white i. It does bring up a good point though that the UILabel should force the font to be AppleColorEmoji, adding that in now as a fail safe, although I think Apple will default it for those anyways
  • A Springham
    A Springham almost 5 years
    Great answer, thanks. It's worth mentioning that your min sdk must be 10.2 to use this part of Swift 5. Also in order to check if a string was only made up of emojis I had to check if it had one of these properties: scalar.properties.isEmoji scalar.properties.isEmojiPresentation scalar.properties.isEmojiModifier scalar.properties.isEmojiModifierBase scalar.properties.isJoinControl scalar.properties.isVariationSelector
  • Miniroo
    Miniroo almost 5 years
    Beware, integers 0-9 are considered emojis. So "6".unicodeScalars.first!.properties.isEmoji will evaluate as true
  • Miniroo
    Miniroo almost 5 years
    How do you deal with "3" and "#" which both evaluate to true for isEmoji
  • Paul B
    Paul B over 4 years
    And what's more is Character("3๏ธโƒฃ").isEmoji // true while Character("3").isEmoji // false
  • vauxhall
    vauxhall over 4 years
    I added also the comparison: $0.properties.generalCategory == .otherSymbol to make it work for more emojis, like โฐ, ๐ŸŒถ, etc
  • Sparga
    Sparga over 4 years
    Thanks for this code! I noticed that flags are not detected has emojis because they are a combination of emojis without join control so I changed isSimpleEmoji implementation to be unicodeScalars.allSatisfy({ $0.properties.isEmojiPresentation }).
  • Kevin R
    Kevin R over 4 years
    @Sparga Thanks! I added your check to isCombinedIntoEmoji, also because some emoji like '๐ŸŒถ' would break otherwise.
  • Yuriy Pavlyshak
    Yuriy Pavlyshak over 4 years
    @KevinR Thanks for the great answer! I found another case when isCombinedIntoEmoji falls short, that is country subdivision flags like ๐Ÿด๓ ง๓ ข๓ ฅ๓ ฎ๓ ง๓ ฟ,๐Ÿด๓ ง๓ ข๓ ณ๓ ฃ๓ ด๓ ฟ. I added another condition for this case which is ... || unicodeScalars.first!.properties.isEmojiPresentation && unicodeScalars.dropFirst().allSatisfy { $0.properties.generalCategory == .format } You can learn more on how these emojis are constructed here, see scotland flag example: blog.emojipedia.org/emoji-flags-explained
  • Kevin R
    Kevin R about 4 years
    @YuriyPavlyshak thanks! Sorry it took me a while to get around to it, but I updated the code!
  • boog
    boog almost 4 years
    NOTE that: 'isEmoji' is only available in iOS 10.2 or newer
  • trndjc
    trndjc over 3 years
    In your testing, would checking whether the character is ASCII be sufficient to checking whether its scalar value is greater than 0x238C?
  • Kevin R
    Kevin R over 3 years
    @bsod that might work, but officially only certain character blocks are assigned as emoji. Simplifying the check like that likely results in false positives. Furthermore: it's not guaranteed to keep working in the future, as more characters are added to the standard.
  • Jan
    Jan over 3 years
    There are other characters like # and * that will also return true for the isEmoji check. isEmojiPresentation seems to work better, at least it returns false for 0...9, #, * and any other symbol I could try on an English-US keyboard. Anyone has more experience with it and knows if it can be trusted for input validation?
  • Omar Masri
    Omar Masri over 3 years
    Thank you for saving me so much time
  • zh.
    zh. about 3 years
    โค๏ธ has two scalars. First scalar's isEmoji is true, but isEmojiPresentation is false. Second scalar's will only return true for isVariationSelector. So doesn't look like a straight forward way to understand what's an emoji ๐Ÿค”
  • goodliving
    goodliving almost 3 years
    It looks that โŒš is not marked as an emoji. I tested this emoji set: stackoverflow.com/a/60565823/1054550
  • Kevin R
    Kevin R almost 3 years
    yes, but also: "1".unicodeScalars.contains { $0.properties.isEmoji } // true
  • Christian Beer
    Christian Beer over 2 years
    Just an short info regarding your update: 65024... 65039 == 0xFE00...0xFE0F so that's doubled.
  • humblehacker
    humblehacker over 2 years
    from the docs: "testing isEmoji alone on a single scalar is insufficient to determine if a unit of text is rendered as an emoji; a correct test requires inspecting multiple scalars in a Character. In addition to checking whether the base scalar has isEmoji == true, you must also check its default presentation (see isEmojiPresentation) and determine whether it is followed by a variation selector that would modify the presentation."
  • Sreekuttan
    Sreekuttan over 2 years
    I have made the changes accordingly @humblehacker
  • jsbox
    jsbox about 2 years
    Why does your code point loop top out at 0x1F0000? The highest legal Unicode code point (scalar) value is 0x10FFFF. So in the above loop the guard statement and its unsuccessful attempts to construct a Unicode.Scaler() is continuing the loop unnecessarily 917,505 times. Or perhaps you meant break rather than continue. What am I missing?
  • jonchoi
    jonchoi almost 2 years
    Thank you. This is great. Is there an update range of new emojis?