When does using swapcase twice not return an identical answer?

被撕碎了的回忆 2021-01-03 18:57

The Python docs for str.swapcase() say:

Note that it is not necessarily true that s.swapcase().swapcase() == s.

I\'

3 Answers
  • 2021-01-03 19:02

    I tried this in Python 2:

    v = lambda x: x.swapcase().swapcase() == x
    [unichr(x) for x in range(10000) if not v(unichr(x))]
    

    This yields the following characters:

    [u'\xb5', u'\u0130', u'\u0131', u'\u017f', u'\u03c2', u'\u03d0', u'\u03d1', u'\u03d5', u'\u03d6', u'\u03f0', u'\u03f1', u'\u03f4', u'\u03f5', u'\u1e9b', u'\u1e9e', u'\u1f80', u'\u1f81', u'\u1f82', u'\u1f83', u'\u1f84', u'\u1f85', u'\u1f86', u'\u1f87', u'\u1f90', u'\u1f91', u'\u1f92', u'\u1f93', u'\u1f94', u'\u1f95', u'\u1f96', u'\u1f97', u'\u1fa0', u'\u1fa1', u'\u1fa2', u'\u1fa3', u'\u1fa4', u'\u1fa5', u'\u1fa6', u'\u1fa7', u'\u1fb3', u'\u1fbe', u'\u1fc3', u'\u1ff3', u'\u2126', u'\u212a', u'\u212b']
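
For anyone on Python 3, here is a rough equivalent of the scan above. It is only a sketch: `round_trips` and `non_round_trip_chars` are names made up for this example, and the scan covers every code point instead of stopping at 10000. Python 3 applies multi-character case mappings (ß becomes SS, for instance), so expect a longer list than the Python 2 output shown here.

```python
import sys


def round_trips(s):
    """Return True if swapping case twice gives back the original string."""
    return s.swapcase().swapcase() == s


def non_round_trip_chars(limit=sys.maxunicode + 1):
    """Collect every single character below `limit` that fails to round-trip."""
    return [chr(cp) for cp in range(limit) if not round_trips(chr(cp))]


if __name__ == "__main__":
    failures = non_round_trip_chars()
    print(len(failures), "characters do not round-trip")
    # Print code points rather than the characters themselves,
    # so the output is safe on terminals without Unicode support.
    print(" ".join("U+%04X" % ord(ch) for ch in failures[:10]))
```

The exact count depends on the Unicode database version bundled with the interpreter, so it can differ between Python releases.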
    
  • 2021-01-03 19:21

    While Volatility brought up the example of the micro sign and the lowercase mu sharing the same uppercase Unicode code point, here's another interesting situation where applying swapcase twice gives a different result:

    >>> 'ß'.swapcase().swapcase()
    'ss'
    

    Confused? The German consonant ß (pronounced [s]) becomes SS after one application of swapcase and then ss after the second.

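    Here's the whole list of them (→ represents one swapcase; for multi-character results, the hex values of the individual code points are written back to back):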

    µ (0xb5) → Μ (0x39c) → μ (0x3bc) → Μ (0x39c)
    ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
    İ (0x130) → i̇ (0x69307) → İ (0x49307) → i̇ (0x69307)
    ı (0x131) → I (0x49) → i (0x69) → I (0x49)
    ʼn (0x149) → ʼN (0x2bc4e) → ʼn (0x2bc6e) → ʼN (0x2bc4e)
    ſ (0x17f) → S (0x53) → s (0x73) → S (0x53)
    ǰ (0x1f0) → J̌ (0x4a30c) → ǰ (0x6a30c) → J̌ (0x4a30c)
    ͅ (0x345) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
    ΐ (0x390) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
    ΰ (0x3b0) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
    ς (0x3c2) → Σ (0x3a3) → σ (0x3c3) → Σ (0x3a3)
    ϐ (0x3d0) → Β (0x392) → β (0x3b2) → Β (0x392)
    ϑ (0x3d1) → Θ (0x398) → θ (0x3b8) → Θ (0x398)
    ϕ (0x3d5) → Φ (0x3a6) → φ (0x3c6) → Φ (0x3a6)
    ϖ (0x3d6) → Π (0x3a0) → π (0x3c0) → Π (0x3a0)
    ϰ (0x3f0) → Κ (0x39a) → κ (0x3ba) → Κ (0x39a)
    ϱ (0x3f1) → Ρ (0x3a1) → ρ (0x3c1) → Ρ (0x3a1)
    ϴ (0x3f4) → θ (0x3b8) → Θ (0x398) → θ (0x3b8)
    ϵ (0x3f5) → Ε (0x395) → ε (0x3b5) → Ε (0x395)
    և (0x587) → ԵՒ (0x535552) → եւ (0x565582) → ԵՒ (0x535552)
    ẖ (0x1e96) → H̱ (0x48331) → ẖ (0x68331) → H̱ (0x48331)
    ẗ (0x1e97) → T̈ (0x54308) → ẗ (0x74308) → T̈ (0x54308)
    ẘ (0x1e98) → W̊ (0x5730a) → ẘ (0x7730a) → W̊ (0x5730a)
    ẙ (0x1e99) → Y̊ (0x5930a) → ẙ (0x7930a) → Y̊ (0x5930a)
    ẚ (0x1e9a) → Aʾ (0x412be) → aʾ (0x612be) → Aʾ (0x412be)
    ẛ (0x1e9b) → Ṡ (0x1e60) → ṡ (0x1e61) → Ṡ (0x1e60)
    ẞ (0x1e9e) → ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
    ὐ (0x1f50) → Υ̓ (0x3a5313) → ὐ (0x3c5313) → Υ̓ (0x3a5313)
    ὒ (0x1f52) → Υ̓̀ (0x3a5313300) → ὒ (0x3c5313300) → Υ̓̀ (0x3a5313300)
    ὔ (0x1f54) → Υ̓́ (0x3a5313301) → ὔ (0x3c5313301) → Υ̓́ (0x3a5313301)
    ὖ (0x1f56) → Υ̓͂ (0x3a5313342) → ὖ (0x3c5313342) → Υ̓͂ (0x3a5313342)
    ᾀ (0x1f80) → ἈΙ (0x1f08399) → ἀι (0x1f003b9) → ἈΙ (0x1f08399)
    ᾁ (0x1f81) → ἉΙ (0x1f09399) → ἁι (0x1f013b9) → ἉΙ (0x1f09399)
    ᾂ (0x1f82) → ἊΙ (0x1f0a399) → ἂι (0x1f023b9) → ἊΙ (0x1f0a399)
    ᾃ (0x1f83) → ἋΙ (0x1f0b399) → ἃι (0x1f033b9) → ἋΙ (0x1f0b399)
    ᾄ (0x1f84) → ἌΙ (0x1f0c399) → ἄι (0x1f043b9) → ἌΙ (0x1f0c399)
    ᾅ (0x1f85) → ἍΙ (0x1f0d399) → ἅι (0x1f053b9) → ἍΙ (0x1f0d399)
    ᾆ (0x1f86) → ἎΙ (0x1f0e399) → ἆι (0x1f063b9) → ἎΙ (0x1f0e399)
    ᾇ (0x1f87) → ἏΙ (0x1f0f399) → ἇι (0x1f073b9) → ἏΙ (0x1f0f399)
    ᾐ (0x1f90) → ἨΙ (0x1f28399) → ἠι (0x1f203b9) → ἨΙ (0x1f28399)
    ᾑ (0x1f91) → ἩΙ (0x1f29399) → ἡι (0x1f213b9) → ἩΙ (0x1f29399)
    ᾒ (0x1f92) → ἪΙ (0x1f2a399) → ἢι (0x1f223b9) → ἪΙ (0x1f2a399)
    ᾓ (0x1f93) → ἫΙ (0x1f2b399) → ἣι (0x1f233b9) → ἫΙ (0x1f2b399)
    ᾔ (0x1f94) → ἬΙ (0x1f2c399) → ἤι (0x1f243b9) → ἬΙ (0x1f2c399)
    ᾕ (0x1f95) → ἭΙ (0x1f2d399) → ἥι (0x1f253b9) → ἭΙ (0x1f2d399)
    ᾖ (0x1f96) → ἮΙ (0x1f2e399) → ἦι (0x1f263b9) → ἮΙ (0x1f2e399)
    ᾗ (0x1f97) → ἯΙ (0x1f2f399) → ἧι (0x1f273b9) → ἯΙ (0x1f2f399)
    ᾠ (0x1fa0) → ὨΙ (0x1f68399) → ὠι (0x1f603b9) → ὨΙ (0x1f68399)
    ᾡ (0x1fa1) → ὩΙ (0x1f69399) → ὡι (0x1f613b9) → ὩΙ (0x1f69399)
    ᾢ (0x1fa2) → ὪΙ (0x1f6a399) → ὢι (0x1f623b9) → ὪΙ (0x1f6a399)
    ᾣ (0x1fa3) → ὫΙ (0x1f6b399) → ὣι (0x1f633b9) → ὫΙ (0x1f6b399)
    ᾤ (0x1fa4) → ὬΙ (0x1f6c399) → ὤι (0x1f643b9) → ὬΙ (0x1f6c399)
    ᾥ (0x1fa5) → ὭΙ (0x1f6d399) → ὥι (0x1f653b9) → ὭΙ (0x1f6d399)
    ᾦ (0x1fa6) → ὮΙ (0x1f6e399) → ὦι (0x1f663b9) → ὮΙ (0x1f6e399)
    ᾧ (0x1fa7) → ὯΙ (0x1f6f399) → ὧι (0x1f673b9) → ὯΙ (0x1f6f399)
    ᾲ (0x1fb2) → ᾺΙ (0x1fba399) → ὰι (0x1f703b9) → ᾺΙ (0x1fba399)
    ᾳ (0x1fb3) → ΑΙ (0x391399) → αι (0x3b13b9) → ΑΙ (0x391399)
    ᾴ (0x1fb4) → ΆΙ (0x386399) → άι (0x3ac3b9) → ΆΙ (0x386399)
    ᾶ (0x1fb6) → Α͂ (0x391342) → ᾶ (0x3b1342) → Α͂ (0x391342)
    ᾷ (0x1fb7) → Α͂Ι (0x391342399) → ᾶι (0x3b13423b9) → Α͂Ι (0x391342399)
    ι (0x1fbe) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
    ῂ (0x1fc2) → ῊΙ (0x1fca399) → ὴι (0x1f743b9) → ῊΙ (0x1fca399)
    ῃ (0x1fc3) → ΗΙ (0x397399) → ηι (0x3b73b9) → ΗΙ (0x397399)
    ῄ (0x1fc4) → ΉΙ (0x389399) → ήι (0x3ae3b9) → ΉΙ (0x389399)
    ῆ (0x1fc6) → Η͂ (0x397342) → ῆ (0x3b7342) → Η͂ (0x397342)
    ῇ (0x1fc7) → Η͂Ι (0x397342399) → ῆι (0x3b73423b9) → Η͂Ι (0x397342399)
    ῒ (0x1fd2) → Ϊ̀ (0x399308300) → ῒ (0x3b9308300) → Ϊ̀ (0x399308300)
    ΐ (0x1fd3) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
    ῖ (0x1fd6) → Ι͂ (0x399342) → ῖ (0x3b9342) → Ι͂ (0x399342)
    ῗ (0x1fd7) → Ϊ͂ (0x399308342) → ῗ (0x3b9308342) → Ϊ͂ (0x399308342)
    ῢ (0x1fe2) → Ϋ̀ (0x3a5308300) → ῢ (0x3c5308300) → Ϋ̀ (0x3a5308300)
    ΰ (0x1fe3) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
    ῤ (0x1fe4) → Ρ̓ (0x3a1313) → ῤ (0x3c1313) → Ρ̓ (0x3a1313)
    ῦ (0x1fe6) → Υ͂ (0x3a5342) → ῦ (0x3c5342) → Υ͂ (0x3a5342)
    ῧ (0x1fe7) → Ϋ͂ (0x3a5308342) → ῧ (0x3c5308342) → Ϋ͂ (0x3a5308342)
    ῲ (0x1ff2) → ῺΙ (0x1ffa399) → ὼι (0x1f7c3b9) → ῺΙ (0x1ffa399)
    ῳ (0x1ff3) → ΩΙ (0x3a9399) → ωι (0x3c93b9) → ΩΙ (0x3a9399)
    ῴ (0x1ff4) → ΏΙ (0x38f399) → ώι (0x3ce3b9) → ΏΙ (0x38f399)
    ῶ (0x1ff6) → Ω͂ (0x3a9342) → ῶ (0x3c9342) → Ω͂ (0x3a9342)
    ῷ (0x1ff7) → Ω͂Ι (0x3a9342399) → ῶι (0x3c93423b9) → Ω͂Ι (0x3a9342399)
    Ω (0x2126) → ω (0x3c9) → Ω (0x3a9) → ω (0x3c9)
    K (0x212a) → k (0x6b) → K (0x4b) → k (0x6b)
    Å (0x212b) → å (0xe5) → Å (0xc5) → å (0xe5)
    ff (0xfb00) → FF (0x4646) → ff (0x6666) → FF (0x4646)
    fi (0xfb01) → FI (0x4649) → fi (0x6669) → FI (0x4649)
    fl (0xfb02) → FL (0x464c) → fl (0x666c) → FL (0x464c)
    ffi (0xfb03) → FFI (0x464649) → ffi (0x666669) → FFI (0x464649)
    ffl (0xfb04) → FFL (0x46464c) → ffl (0x66666c) → FFL (0x46464c)
    ſt (0xfb05) → ST (0x5354) → st (0x7374) → ST (0x5354)
    st (0xfb06) → ST (0x5354) → st (0x7374) → ST (0x5354)
    ﬓ (0xfb13) → ՄՆ (0x544546) → մն (0x574576) → ՄՆ (0x544546)
    ﬔ (0xfb14) → ՄԵ (0x544535) → մե (0x574565) → ՄԵ (0x544535)
    ﬕ (0xfb15) → ՄԻ (0x54453b) → մի (0x57456b) → ՄԻ (0x54453b)
    ﬖ (0xfb16) → ՎՆ (0x54e546) → վն (0x57e576) → ՎՆ (0x54e546)
    ﬗ (0xfb17) → ՄԽ (0x54453d) → մխ (0x57456d) → ՄԽ (0x54453d)
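
Rows in this format can be produced with a few lines of Python 3. This is a sketch, not necessarily how the list above was generated: `describe`, `swapcase_chain` and `format_chain` are hypothetical helper names, and the hex part is built by concatenating each character's code point, which matches values such as 0x5353 for SS.

```python
def describe(s):
    """Format a string as 'text (0x...)', concatenating the hex code points."""
    return "%s (0x%s)" % (s, "".join(format(ord(c), "x") for c in s))


def swapcase_chain(s, steps=3):
    """Return [s, s.swapcase(), s.swapcase().swapcase(), ...] with `steps` swaps."""
    chain = [s]
    for _ in range(steps):
        chain.append(chain[-1].swapcase())
    return chain


def format_chain(s, steps=3):
    """Join the chain with arrows, one arrow per swapcase application."""
    return " → ".join(describe(x) for x in swapcase_chain(s, steps))
```

For example, `format_chain("ß")` returns `ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)`, the same as the ß row.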
    
  • 2021-01-03 19:22

    This happens when several distinct lowercase characters share the same uppercase character.

    For example, the micro character µ (U+00B5) and the mu character μ (U+03BC):

    >>> u'\xb5'.swapcase()
    u'\u039c'
    >>> u'\u03bc'.swapcase()
    u'\u039c'
    

    The two are different characters, but their uppercase counterparts are the same, so str.swapcase() maps both to the same character. Applying swapcase() again to that single uppercase character can return only one lowercase character, so at least one of the originals can't round-trip.

    >>> u'\xb5'.swapcase().swapcase()
    u'\u03bc'
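
To see which characters collide like this, you can group them by their uppercase form. Below is a Python 3 sketch. The helper name `shared_uppercase` and the default limit of 0x2000 are arbitrary choices for this example, and it uses `str.upper()` as a stand-in for what swapcase does to a lowercase character.

```python
from collections import defaultdict


def shared_uppercase(limit=0x2000):
    """Map each uppercase form to the characters below `limit` that uppercase to it.

    Only forms shared by two or more distinct characters are returned.
    """
    groups = defaultdict(list)
    for cp in range(limit):
        ch = chr(cp)
        up = ch.upper()
        if up != ch:  # skip characters that upper() leaves unchanged
            groups[up].append(ch)
    return {up: chars for up, chars in groups.items() if len(chars) > 1}
```

With this, `shared_uppercase()["\u039c"]` contains both `"\xb5"` (micro) and `"\u03bc"` (mu), which is the collision described above.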
    