Why are emoji characters like

野趣味 2020-11-28 00:14

The character

6 answers
  • 2020-11-28 00:37

    Swift 4.0 update

    String received many revisions in the Swift 4 update, as documented in SE-0163. Two emoji are used for this demo, representing two different structures. Both are combined with a sequence of emoji.

  • 2020-11-28 00:44

    This has to do with how the String type works in Swift, and how the contains(_:) method works.

    The '
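As a rough illustration of the point about `contains(_:)`, the sketch below compares a `Character`-level search with a search over `unicodeScalars`. The sample string (woman, ZERO WIDTH JOINER, girl, written with escapes) and all variable names are assumptions chosen for this example; they are not taken from the original question.

```swift
// Assumed sample: U+1F469 WOMAN + U+200D ZERO WIDTH JOINER + U+1F467 GIRL.
let family = "\u{1F469}\u{200D}\u{1F467}"
let woman: Character = "\u{1F469}"
let womanScalar: Unicode.Scalar = "\u{1F469}"

// Searching the String itself compares whole Characters (grapheme clusters),
// so a lone WOMAN never matches a cluster that also carries the joiner.
let foundAsCharacter = family.contains(woman)

// Searching the unicodeScalars view compares individual code points.
let foundAsScalar = family.unicodeScalars.contains(womanScalar)

print(foundAsCharacter, foundAsScalar)
```

The scalar view finds the code point while the `Character`-level search does not, which is one plausible way the results described in this thread can differ depending on the view being searched.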

  • 2020-11-28 00:46

    The other answers discuss what Swift does, but don't go into much detail about why.

    Do you expect “Å” to equal “Å”? I expect you would.

    One of these is a letter with a combining mark; the other is a single precomposed character. You can add many different combining marks to a base character, and a human would still consider the result a single character. To deal with this sort of discrepancy, the concept of a grapheme was created to represent what a human would consider a character, regardless of the code points used.
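A minimal sketch of that idea in Swift, assuming the two spellings are U+00C5 (precomposed) and "A" followed by U+030A (combining ring above). The variable names are illustrative only.

```swift
let composed = "\u{C5}"      // U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
let decomposed = "A\u{30A}"  // "A" followed by U+030A COMBINING RING ABOVE

// String equality is based on canonical equivalence, not on raw code points.
let areEqual = composed == decomposed

// Each spelling is a single Character (grapheme cluster)...
let composedCharacters = composed.count
let decomposedCharacters = decomposed.count

// ...but they are built from a different number of Unicode scalars.
let composedScalars = composed.unicodeScalars.count
let decomposedScalars = decomposed.unicodeScalars.count

print(areEqual, composedCharacters, decomposedCharacters, composedScalars, decomposedScalars)
```

Both strings compare equal and count as one `Character` each, even though one holds a single scalar and the other holds two.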

    Now text messaging services have been combining characters into graphical emoji for years :) → 

  • 2020-11-28 00:59

    The first problem is that you're bridging to Foundation with contains (at the time of writing, Swift's String is not a Collection), so this is NSString behavior, which I don't believe handles composed emoji as powerfully as Swift does. That said, I believe Swift is implementing Unicode 8 right now, which also needed revision around this situation in Unicode 10 (so this may all change when they implement Unicode 10; I haven't dug into whether it will or not).

    To simplify things, let's get rid of Foundation and use Swift, which provides views that are more explicit. We'll start with characters:

    "                                                                    
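The listing above is cut off in this copy. Independently of it, here is a small sketch of what "more explicit views" can look like without Foundation. The sample string (a flag built from two regional indicator scalars, U+1F1FA and U+1F1F8) is an assumption made for illustration, and the sketch uses `count` on the string itself rather than the older `characters` view.

```swift
// Assumed sample: REGIONAL INDICATOR SYMBOL LETTER U + LETTER S.
let flag = "\u{1F1FA}\u{1F1F8}"

let characterCount = flag.count              // grapheme clusters
let scalarCount = flag.unicodeScalars.count  // Unicode scalars (code points)
let utf16Count = flag.utf16.count            // UTF-16 code units, what NSString counts
let utf8Count = flag.utf8.count              // UTF-8 bytes

print(characterCount, scalarCount, utf16Count, utf8Count)
```

The same string reports 1, 2, 4, and 8 depending on the view, so it helps to name the view explicitly when reasoning about what a search or a length actually measures.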
  • 2020-11-28 01:01

    Emoji, much like the Unicode standard, are deceptively complicated. Skin tones, genders, jobs, groups of people, zero-width joiner sequences, flags (pairs of regional indicator code points), and other complications can make emoji parsing messy. A Christmas Tree, a Slice of Pizza, or a Pile of Poop can each be represented with a single Unicode code point, but many other emoji cannot. On top of that, when new emoji are introduced, iOS support lags behind their release, and different versions of iOS support different versions of the Unicode standard.
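To make the difference concrete, the following sketch lists the scalars behind three emoji. The helper `scalarValues(of:)` and the chosen samples (Christmas Tree, thumbs up with a medium skin tone modifier, woman technologist) are hypothetical names and examples picked for this illustration; they have nothing to do with the library mentioned below.

```swift
// Hypothetical helper: renders every Unicode scalar of a string as "U+XXXX".
func scalarValues(of text: String) -> [String] {
    return text.unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
}

let tree = "\u{1F384}"                               // single code point
let thumbsUpMediumSkinTone = "\u{1F44D}\u{1F3FD}"    // base emoji + skin tone modifier
let womanTechnologist = "\u{1F469}\u{200D}\u{1F4BB}" // woman + ZWJ + laptop

print(scalarValues(of: tree))
print(scalarValues(of: thumbsUpMediumSkinTone))
print(scalarValues(of: womanTechnologist))
```

A parser that assumes one code point per emoji handles the first sample and mangles the other two, which is the kind of messiness described above.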

    TL;DR: I have worked on these features and open-sourced a library I authored, JKEmoji, to help parse strings with emoji. It makes parsing as easy as:

    print("I love these emojis                                                                     
  • 2020-11-28 01:03

    It seems that Swift treats a ZWJ as part of the same extended grapheme cluster as the character immediately preceding it. We can see this when mapping the array of characters to their unicodeScalars:

    Array(manual.characters).map { $0.description.unicodeScalars }
    

    This prints the following from LLDB:

    ▿ 4 elements
      ▿ 0 : StringUnicodeScalarView("                                                                    
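Since the debugger output is truncated in this copy, here is a self-contained sketch of the same inspection. The sample string (woman, ZWJ, girl) stands in for the `manual` value, whose definition is not shown here, and the variable names are assumptions. How many clusters come out depends on the Unicode rules of the Swift version in use, so the sketch only checks that the joiner stays attached to what precedes it.

```swift
// Assumed stand-in for `manual`: U+1F469 WOMAN + U+200D ZWJ + U+1F467 GIRL.
let joined = "\u{1F469}\u{200D}\u{1F467}"

// For every Character (grapheme cluster), collect the scalar values it is made of.
let scalarsPerCharacter: [[UInt32]] = joined.map { character in
    String(character).unicodeScalars.map { $0.value }
}

// A joiner that belongs to the preceding character never opens a new cluster.
let zwj: UInt32 = 0x200D
let zwjStartsACluster = scalarsPerCharacter.contains { $0.first == zwj }

// All three scalars are accounted for, however they are grouped.
let totalScalars = scalarsPerCharacter.reduce(0) { $0 + $1.count }

print(scalarsPerCharacter, zwjStartsACluster, totalScalars)
```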