The Python docs for str.swapcase() say:
Note that it is not necessarily true that s.swapcase().swapcase() == s.
I tried this:
v = lambda x: x.swapcase().swapcase() == x
[unichr(x) for x in range(10000) if not v(unichr(x))]
This produces the following characters:
[u'\xb5', u'\u0130', u'\u0131', u'\u017f', u'\u03c2', u'\u03d0', u'\u03d1', u'\u03d5', u'\u03d6', u'\u03f0', u'\u03f1', u'\u03f4', u'\u03f5', u'\u1e9b', u'\u1e9e', u'\u1f80', u'\u1f81', u'\u1f82', u'\u1f83', u'\u1f84', u'\u1f85', u'\u1f86', u'\u1f87', u'\u1f90', u'\u1f91', u'\u1f92', u'\u1f93', u'\u1f94', u'\u1f95', u'\u1f96', u'\u1f97', u'\u1fa0', u'\u1fa1', u'\u1fa2', u'\u1fa3', u'\u1fa4', u'\u1fa5', u'\u1fa6', u'\u1fa7', u'\u1fb3', u'\u1fbe', u'\u1fc3', u'\u1ff3', u'\u2126', u'\u212a', u'\u212b']
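The snippet above is Python 2 (unichr and u'' literals). A rough Python 3 equivalent might look like the sketch below; the names roundtrips and failures are made up for illustration. Python 3 applies full case mappings, so its result should also contain characters such as ß that are missing from the Python 2 output above.

```python
# Hypothetical Python 3 port of the check above; helper names are illustrative.

def roundtrips(ch: str) -> bool:
    """Return True if swapping case twice gives back the original string."""
    return ch.swapcase().swapcase() == ch

# Same range as the original experiment: code points 0 .. 9999.
failures = [chr(cp) for cp in range(10000) if not roundtrips(chr(cp))]

if __name__ == "__main__":
    print(len(failures))
    print([hex(ord(ch)) for ch in failures[:10]])
```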
Volatility brought up the example of the micro sign and the Greek letter mu, whose uppercase forms resolve to the same Unicode code point. Here's another interesting situation where applying swapcase twice results in a different answer:
>>> 'ß'.swapcase().swapcase()
'ss'
Confused? The German consonant ß (pronounced [s]) becomes SS after one application of swapcase and then ss after the second.
Here's the whole list of them (→ represents one swapcase; the hexadecimal value in parentheses is the code points of each string concatenated):
µ (0xb5) → Μ (0x39c) → μ (0x3bc) → Μ (0x39c)
ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
İ (0x130) → i̇ (0x69307) → İ (0x49307) → i̇ (0x69307)
ı (0x131) → I (0x49) → i (0x69) → I (0x49)
ʼn (0x149) → ʼN (0x2bc4e) → ʼn (0x2bc6e) → ʼN (0x2bc4e)
ſ (0x17f) → S (0x53) → s (0x73) → S (0x53)
ǰ (0x1f0) → J̌ (0x4a30c) → ǰ (0x6a30c) → J̌ (0x4a30c)
ͅ (0x345) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
ΐ (0x390) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
ΰ (0x3b0) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
ς (0x3c2) → Σ (0x3a3) → σ (0x3c3) → Σ (0x3a3)
ϐ (0x3d0) → Β (0x392) → β (0x3b2) → Β (0x392)
ϑ (0x3d1) → Θ (0x398) → θ (0x3b8) → Θ (0x398)
ϕ (0x3d5) → Φ (0x3a6) → φ (0x3c6) → Φ (0x3a6)
ϖ (0x3d6) → Π (0x3a0) → π (0x3c0) → Π (0x3a0)
ϰ (0x3f0) → Κ (0x39a) → κ (0x3ba) → Κ (0x39a)
ϱ (0x3f1) → Ρ (0x3a1) → ρ (0x3c1) → Ρ (0x3a1)
ϴ (0x3f4) → θ (0x3b8) → Θ (0x398) → θ (0x3b8)
ϵ (0x3f5) → Ε (0x395) → ε (0x3b5) → Ε (0x395)
և (0x587) → ԵՒ (0x535552) → եւ (0x565582) → ԵՒ (0x535552)
ẖ (0x1e96) → H̱ (0x48331) → ẖ (0x68331) → H̱ (0x48331)
ẗ (0x1e97) → T̈ (0x54308) → ẗ (0x74308) → T̈ (0x54308)
ẘ (0x1e98) → W̊ (0x5730a) → ẘ (0x7730a) → W̊ (0x5730a)
ẙ (0x1e99) → Y̊ (0x5930a) → ẙ (0x7930a) → Y̊ (0x5930a)
ẚ (0x1e9a) → Aʾ (0x412be) → aʾ (0x612be) → Aʾ (0x412be)
ẛ (0x1e9b) → Ṡ (0x1e60) → ṡ (0x1e61) → Ṡ (0x1e60)
ẞ (0x1e9e) → ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
ὐ (0x1f50) → Υ̓ (0x3a5313) → ὐ (0x3c5313) → Υ̓ (0x3a5313)
ὒ (0x1f52) → Υ̓̀ (0x3a5313300) → ὒ (0x3c5313300) → Υ̓̀ (0x3a5313300)
ὔ (0x1f54) → Υ̓́ (0x3a5313301) → ὔ (0x3c5313301) → Υ̓́ (0x3a5313301)
ὖ (0x1f56) → Υ̓͂ (0x3a5313342) → ὖ (0x3c5313342) → Υ̓͂ (0x3a5313342)
ᾀ (0x1f80) → ἈΙ (0x1f08399) → ἀι (0x1f003b9) → ἈΙ (0x1f08399)
ᾁ (0x1f81) → ἉΙ (0x1f09399) → ἁι (0x1f013b9) → ἉΙ (0x1f09399)
ᾂ (0x1f82) → ἊΙ (0x1f0a399) → ἂι (0x1f023b9) → ἊΙ (0x1f0a399)
ᾃ (0x1f83) → ἋΙ (0x1f0b399) → ἃι (0x1f033b9) → ἋΙ (0x1f0b399)
ᾄ (0x1f84) → ἌΙ (0x1f0c399) → ἄι (0x1f043b9) → ἌΙ (0x1f0c399)
ᾅ (0x1f85) → ἍΙ (0x1f0d399) → ἅι (0x1f053b9) → ἍΙ (0x1f0d399)
ᾆ (0x1f86) → ἎΙ (0x1f0e399) → ἆι (0x1f063b9) → ἎΙ (0x1f0e399)
ᾇ (0x1f87) → ἏΙ (0x1f0f399) → ἇι (0x1f073b9) → ἏΙ (0x1f0f399)
ᾐ (0x1f90) → ἨΙ (0x1f28399) → ἠι (0x1f203b9) → ἨΙ (0x1f28399)
ᾑ (0x1f91) → ἩΙ (0x1f29399) → ἡι (0x1f213b9) → ἩΙ (0x1f29399)
ᾒ (0x1f92) → ἪΙ (0x1f2a399) → ἢι (0x1f223b9) → ἪΙ (0x1f2a399)
ᾓ (0x1f93) → ἫΙ (0x1f2b399) → ἣι (0x1f233b9) → ἫΙ (0x1f2b399)
ᾔ (0x1f94) → ἬΙ (0x1f2c399) → ἤι (0x1f243b9) → ἬΙ (0x1f2c399)
ᾕ (0x1f95) → ἭΙ (0x1f2d399) → ἥι (0x1f253b9) → ἭΙ (0x1f2d399)
ᾖ (0x1f96) → ἮΙ (0x1f2e399) → ἦι (0x1f263b9) → ἮΙ (0x1f2e399)
ᾗ (0x1f97) → ἯΙ (0x1f2f399) → ἧι (0x1f273b9) → ἯΙ (0x1f2f399)
ᾠ (0x1fa0) → ὨΙ (0x1f68399) → ὠι (0x1f603b9) → ὨΙ (0x1f68399)
ᾡ (0x1fa1) → ὩΙ (0x1f69399) → ὡι (0x1f613b9) → ὩΙ (0x1f69399)
ᾢ (0x1fa2) → ὪΙ (0x1f6a399) → ὢι (0x1f623b9) → ὪΙ (0x1f6a399)
ᾣ (0x1fa3) → ὫΙ (0x1f6b399) → ὣι (0x1f633b9) → ὫΙ (0x1f6b399)
ᾤ (0x1fa4) → ὬΙ (0x1f6c399) → ὤι (0x1f643b9) → ὬΙ (0x1f6c399)
ᾥ (0x1fa5) → ὭΙ (0x1f6d399) → ὥι (0x1f653b9) → ὭΙ (0x1f6d399)
ᾦ (0x1fa6) → ὮΙ (0x1f6e399) → ὦι (0x1f663b9) → ὮΙ (0x1f6e399)
ᾧ (0x1fa7) → ὯΙ (0x1f6f399) → ὧι (0x1f673b9) → ὯΙ (0x1f6f399)
ᾲ (0x1fb2) → ᾺΙ (0x1fba399) → ὰι (0x1f703b9) → ᾺΙ (0x1fba399)
ᾳ (0x1fb3) → ΑΙ (0x391399) → αι (0x3b13b9) → ΑΙ (0x391399)
ᾴ (0x1fb4) → ΆΙ (0x386399) → άι (0x3ac3b9) → ΆΙ (0x386399)
ᾶ (0x1fb6) → Α͂ (0x391342) → ᾶ (0x3b1342) → Α͂ (0x391342)
ᾷ (0x1fb7) → Α͂Ι (0x391342399) → ᾶι (0x3b13423b9) → Α͂Ι (0x391342399)
ι (0x1fbe) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
ῂ (0x1fc2) → ῊΙ (0x1fca399) → ὴι (0x1f743b9) → ῊΙ (0x1fca399)
ῃ (0x1fc3) → ΗΙ (0x397399) → ηι (0x3b73b9) → ΗΙ (0x397399)
ῄ (0x1fc4) → ΉΙ (0x389399) → ήι (0x3ae3b9) → ΉΙ (0x389399)
ῆ (0x1fc6) → Η͂ (0x397342) → ῆ (0x3b7342) → Η͂ (0x397342)
ῇ (0x1fc7) → Η͂Ι (0x397342399) → ῆι (0x3b73423b9) → Η͂Ι (0x397342399)
ῒ (0x1fd2) → Ϊ̀ (0x399308300) → ῒ (0x3b9308300) → Ϊ̀ (0x399308300)
ΐ (0x1fd3) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
ῖ (0x1fd6) → Ι͂ (0x399342) → ῖ (0x3b9342) → Ι͂ (0x399342)
ῗ (0x1fd7) → Ϊ͂ (0x399308342) → ῗ (0x3b9308342) → Ϊ͂ (0x399308342)
ῢ (0x1fe2) → Ϋ̀ (0x3a5308300) → ῢ (0x3c5308300) → Ϋ̀ (0x3a5308300)
ΰ (0x1fe3) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
ῤ (0x1fe4) → Ρ̓ (0x3a1313) → ῤ (0x3c1313) → Ρ̓ (0x3a1313)
ῦ (0x1fe6) → Υ͂ (0x3a5342) → ῦ (0x3c5342) → Υ͂ (0x3a5342)
ῧ (0x1fe7) → Ϋ͂ (0x3a5308342) → ῧ (0x3c5308342) → Ϋ͂ (0x3a5308342)
ῲ (0x1ff2) → ῺΙ (0x1ffa399) → ὼι (0x1f7c3b9) → ῺΙ (0x1ffa399)
ῳ (0x1ff3) → ΩΙ (0x3a9399) → ωι (0x3c93b9) → ΩΙ (0x3a9399)
ῴ (0x1ff4) → ΏΙ (0x38f399) → ώι (0x3ce3b9) → ΏΙ (0x38f399)
ῶ (0x1ff6) → Ω͂ (0x3a9342) → ῶ (0x3c9342) → Ω͂ (0x3a9342)
ῷ (0x1ff7) → Ω͂Ι (0x3a9342399) → ῶι (0x3c93423b9) → Ω͂Ι (0x3a9342399)
Ω (0x2126) → ω (0x3c9) → Ω (0x3a9) → ω (0x3c9)
K (0x212a) → k (0x6b) → K (0x4b) → k (0x6b)
Å (0x212b) → å (0xe5) → Å (0xc5) → å (0xe5)
ff (0xfb00) → FF (0x4646) → ff (0x6666) → FF (0x4646)
fi (0xfb01) → FI (0x4649) → fi (0x6669) → FI (0x4649)
fl (0xfb02) → FL (0x464c) → fl (0x666c) → FL (0x464c)
ffi (0xfb03) → FFI (0x464649) → ffi (0x666669) → FFI (0x464649)
ffl (0xfb04) → FFL (0x46464c) → ffl (0x66666c) → FFL (0x46464c)
ſt (0xfb05) → ST (0x5354) → st (0x7374) → ST (0x5354)
st (0xfb06) → ST (0x5354) → st (0x7374) → ST (0x5354)
ﬓ (0xfb13) → ՄՆ (0x544546) → մն (0x574576) → ՄՆ (0x544546)
ﬔ (0xfb14) → ՄԵ (0x544535) → մե (0x574565) → ՄԵ (0x544535)
ﬕ (0xfb15) → ՄԻ (0x54453b) → մի (0x57456b) → ՄԻ (0x54453b)
ﬖ (0xfb16) → ՎՆ (0x54e546) → վն (0x57e576) → ՎՆ (0x54e546)
ﬗ (0xfb17) → ՄԽ (0x54453d) → մխ (0x57456d) → ՄԽ (0x54453d)
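A list in this shape can be regenerated with a few lines of Python 3. This is only a sketch, not necessarily the script used for the list above: swapcase_chain, fmt and describe are hypothetical helper names, and the fixed count of three swaps is an assumption (the ẞ entry above shows one extra step).

```python
# Sketch: rebuild one line of the list for a given character.
# All helper names are illustrative; the step count of 3 is an assumption.

def swapcase_chain(ch, steps=3):
    """Return [ch, ch swapped once, swapped twice, ...]."""
    chain = [ch]
    for _ in range(steps):
        chain.append(chain[-1].swapcase())
    return chain

def fmt(s):
    """Render a string followed by its code points concatenated in hex."""
    return "{} (0x{})".format(s, "".join(format(ord(c), "x") for c in s))

def describe(ch):
    """Join the chain with arrows, one arrow per swapcase call."""
    return " → ".join(fmt(s) for s in swapcase_chain(ch))

if __name__ == "__main__":
    print(describe("ß"))  # ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
    # Print every character in the Basic Multilingual Plane that fails the round trip.
    for cp in range(0x10000):
        ch = chr(cp)
        if ch.swapcase().swapcase() != ch:
            print(describe(ch))
```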
This happens when several distinct lowercase characters share the same uppercase letter. For example, the micro sign µ (U+00B5) and the Greek letter mu μ (U+03BC):
>>> u'\xb5'.swapcase()
u'\u039c'
>>> u'\u03bc'.swapcase()
u'\u039c'
The two are different characters, but their uppercase counterparts are the same. This means that when str.swapcase() is applied, they return the same character. Applying it again can only return one of the two, so the round trip cannot recover both originals.
>>> u'\xb5'.swapcase().swapcase()
u'\u03bc'
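To see which characters collide this way, one option is to group lowercase characters by what swapcase() turns them into. The following Python 3 sketch does that; the function name collisions and the 0x400 scan limit are arbitrary choices made for this illustration.

```python
from collections import defaultdict

def collisions(limit=0x400):
    """Map each swapcase() result to the lowercase characters producing it,
    keeping only results shared by more than one source character."""
    groups = defaultdict(list)
    for cp in range(limit):
        ch = chr(cp)
        if ch.islower():
            groups[ch.swapcase()].append(ch)
    return {up: lows for up, lows in groups.items() if len(lows) > 1}

if __name__ == "__main__":
    for up, lows in collisions().items():
        print(up, [hex(ord(c)) for c in lows])
```

In every group printed, at most one member can survive a double swapcase(), because the shared uppercase form maps back to a single lowercase character.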