Regex to compare strings with Umlaut and non-Umlaut variations

刺人心 2021-01-13 10:34

Can anyone help me with a JavaScript regular expression that I can use to compare strings and treat them as the same, taking into account their non-Umlaut versions?

for e

7 answers
  •  -上瘾入骨i
    2021-01-13 10:54

    Regular expressions aren't quite powerful enough to do this properly, though you could hack it into almost working with them.

    What you want is called Unicode normalization. A normalized string is one that has been converted to a common form so that two strings can be compared reliably. You tagged your post "javascript": older JavaScript engines had no built-in way to do this, but since ES2015 strings have a `normalize()` method. Most server-side languages have an equivalent as well, for example the Normalizer class in PHP. Python and Perl have equivalents, as do Microsoft's platforms, I'm sure.

    Check out the Wikipedia article on Unicode equivalence for more information.
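
A minimal sketch of the normalization approach in JavaScript, relying on the ES2015 `String.prototype.normalize` method. The function names are hypothetical, and it assumes that "non-Umlaut version" means the plain base letter (ü becomes u), not the German "ue" spelling, which would need an explicit mapping table instead.

```javascript
// Hypothetical helper names; only String.prototype.normalize is built in.

// True when two strings are canonically equivalent, e.g. precomposed "ü"
// (U+00FC) versus "u" followed by a combining diaeresis (U+0308).
function sameNormalized(a, b) {
  return a.normalize('NFC') === b.normalize('NFC');
}

// Decompose to NFD, then drop combining marks in U+0300..U+036F,
// so "ü" becomes "u". This does not produce the German "ue" spelling.
function stripDiacritics(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Compare two strings while ignoring umlauts and other combining accents.
function sameIgnoringUmlauts(a, b) {
  return stripDiacritics(a) === stripDiacritics(b);
}
```

Note that the regular expression here only removes the combining marks after decomposition; the normalization step does the real work. Case differences are not handled, so lowercase both inputs first if that matters.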
