Question
I'm struggling to figure out a reasonable solution to this. I need to replace the following characters: ⁰¹²³⁴⁵⁶⁷⁸⁹ using a regex replace. I would think that you would just do this:
item = item.replace(/[⁰¹²³⁴⁵⁶⁷⁸⁹]/g, '');
However, when I try to do that, Notepad++ converts symbols 5-9 into regular digits. I realize this probably relates to the file encoding I am using, which I see is set to ANSI.
I've never really understood the difference between the various encoding formats, but I'm wondering if there is an easy fix for this issue?
Answer 1:
Here is a simple regex for finding all superscript numbers:
/\p{No}/gu
Breakdown:
\p{No}: matches a superscript or subscript digit, or a number that is not a digit [0-9]
u modifier: unicode. Pattern strings are treated as UTF-16. Also causes escape sequences to match Unicode characters
g modifier: global. All matches (don't return on first match)
https://regex101.com/r/zA8sJ4/1
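As a minimal sketch of the pattern in use, assuming a JavaScript engine that supports Unicode property escapes (standardized in ES2018): the helper name stripOtherNumbers is hypothetical and not part of the original answer. Note that \p{No} is broader than the ten superscript digits, so subscripts and fractions such as ½ are removed as well.

```javascript
// Hypothetical helper (name is an assumption, not from the answer).
// Removes every character in Unicode general category "No":
// superscript and subscript digits, fractions such as ½, and so on.
// Assumes an engine with Unicode property escape support (ES2018+).
function stripOtherNumbers(text) {
  return text.replace(/\p{No}/gu, '');
}

console.log(stripOtherNumbers('x² + y³ = z⁴')); // "x + y = z"
console.log(stripOtherNumbers('H₂O'));          // "HO"
console.log(stripOtherNumbers('room 42'));      // "room 42" (ASCII digits are kept)
```

If only the superscript digits should go, an explicit character class is the safer choice.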
Note that \p{No} only works in engines that support Unicode property escapes together with the u flag (they were standardized in ES2018). For browsers without built-in support, I would recommend using the XRegExp library:
XRegExp provides augmented (and extensible) JavaScript regular expressions. You get new modern syntax and flags beyond what browsers support natively. XRegExp is also a regex utility belt with tools to make your client-side grepping and parsing easier, while freeing you from worrying about pesky aspects of JavaScript regexes like cross-browser inconsistencies or manually manipulating lastIndex.
http://xregexp.com/
HTML Solution
HTML has a <sup> tag for representing superscript text.
The <sup> tag defines superscript text. Superscript text appears half a character above the normal line, and is sometimes rendered in a smaller font. Superscript text can be used for footnotes, like WWW[1].
If there are superscript numbers, the HTML markup almost surely has the <sup> tag.
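// Look up the element that holds the expression.
var math = document.getElementById("math");
// Remove every <sup> element that contains only digits (\d+ also covers multi-digit exponents).
math.innerHTML = math.innerHTML.replace(/<sup>\d+<\/sup>/g, "");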
<p id="math">4<sup>2</sup>+ 3<sup>2</sup></p>
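The same replacement can be tried on a plain HTML string, without a DOM. This is only a sketch: the helper name stripNumericSup is hypothetical, and it uses \d+ so that multi-digit superscripts such as <sup>10</sup> are matched too.

```javascript
// Hypothetical helper (name is an assumption, not from the answer).
// Drops <sup> elements whose content consists only of digits.
function stripNumericSup(html) {
  return html.replace(/<sup>\d+<\/sup>/g, "");
}

console.log(stripNumericSup("4<sup>2</sup>+ 3<sup>2</sup>")); // "4+ 3"
console.log(stripNumericSup("2<sup>10</sup>"));               // "2"
console.log(stripNumericSup("WWW<sup>[1]</sup>"));            // unchanged: content is not only digits
```

A regex like this only handles bare <sup> tags; markup with attributes or nested elements would need a real HTML parser.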
Answer 2:
Use UTF-8. If for some reason you can't, a workaround is to write the characters as \u escape sequences, so the source file contains only ASCII:
var rg = new RegExp(
"[\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079]",
"g"
);
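A short usage sketch of the escaped pattern follows. Because the source contains only ASCII, it should behave the same whether the file is saved as ANSI or UTF-8. The names superscriptDigits and removeSuperscriptDigits are illustrative and not from the original answer.

```javascript
// Escaped code points for the ten superscript digits 0 through 9.
// Note that 1, 2 and 3 live in Latin-1 (U+00B9, U+00B2, U+00B3);
// the rest are in the U+2070 block.
var superscriptDigits = new RegExp(
  "[\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079]",
  "g"
);

// Illustrative helper: removes superscript digits, keeps everything else.
function removeSuperscriptDigits(item) {
  return item.replace(superscriptDigits, "");
}

console.log(removeSuperscriptDigits("E = mc\u00b2")); // "E = mc"
console.log(removeSuperscriptDigits("10\u2079"));     // "10"
```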
Answer 3:
I'd suggest trying the following regex:
/[\u2070-\u209f\u00b0-\u00be]+/g
The code will look like this:
var re = /[\u2070-\u209f\u00b0-\u00be]+/g;
var str = '⁰¹²³⁴⁵⁶⁷⁸⁹2sometext';
var subst = '';
// Replace every run of matched characters with the empty string.
var result = str.replace(re, subst);
After a successful run, result will contain:
2sometext
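One caveat worth checking before adopting these ranges: they cover more than the ten superscript digits. U+2070 to U+209F also holds subscripts, and U+00B0 to U+00BE includes characters such as ° and ½. A small sketch comparing the ranged pattern with an explicit list (the variable names ranged, explicit and sample are illustrative):

```javascript
// The ranged pattern from the answer above.
var ranged = /[\u2070-\u209f\u00b0-\u00be]+/g;
// An explicit list limited to the ten superscript digits.
var explicit = /[\u2070\u00b9\u00b2\u00b3\u2074-\u2079]+/g;

// "90° x² H₂O", written with escapes so the file encoding does not matter.
var sample = "90\u00b0 x\u00b2 H\u2082O";

var rangedResult = sample.replace(ranged, "");     // degree sign and subscript are removed too
var explicitResult = sample.replace(explicit, ""); // only the superscript 2 is removed

console.log(rangedResult);   // "90 x HO"
console.log(explicitResult); // "90° x H₂O"
```

Which behavior is preferable depends on whether the input can contain those other symbols.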
Source: https://stackoverflow.com/questions/35976910/regex-to-replace-all-superscript-numbers