Question
For [[Test#?]], I get "Test#.3F" from the action=parse part of the MediaWiki API. What is this encoding, and how do I convert it to a human-readable form using Perl modules from CPAN?
URI::Encode handles ordinary percent-decoding, but not this section-name encoding.
Answer 1:
It is UTF-8 percent-encoding, but with "." instead of "%", and spaces replaced with underscores; additionally, multiple consecutive whitespace characters are collapsed, and ":" is preserved (not encoded into ".3A").
The exact code which handles it is Parser::guessSectionNameFromWikiText(), but if you do not want to dig through a lot of code, check the much simpler implementation in an older MediaWiki version (compatible except for a few edge cases), in anchorencode():
str_replace( '%', '.', str_replace('+', '_', urlencode( $text ) ) );
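To make the reverse direction concrete, here is a minimal sketch in Python (chosen only for illustration; the same steps map onto a Perl substitution). The names anchor_encode and decode_anchor are hypothetical and not part of MediaWiki or any CPAN module. It assumes the legacy scheme shown above: uppercase hex escapes introduced by "." and underscores standing in for spaces. The encoding is lossy, so a literal "." that happens to be followed by two hex digits, or a literal underscore, cannot be told apart from an escape; treat the result as a best guess.

```python
import re
from urllib.parse import quote_plus

# One or more consecutive ".XX" escapes (uppercase hex, as PHP's urlencode emits).
# Runs are decoded together so multi-byte UTF-8 sequences stay intact.
_ESCAPE_RUN = re.compile(r"(?:\.[0-9A-F]{2})+")


def anchor_encode(section: str) -> str:
    """Approximate the legacy PHP one-liner for a section name.

    Assumption: quote_plus is close to PHP's urlencode, but not identical
    (for example, Python leaves "~" unescaped while PHP encodes it).
    """
    return quote_plus(section, safe="").replace("+", "_").replace("%", ".")


def decode_anchor(anchor: str) -> str:
    """Best-effort decoding of a MediaWiki-style section anchor."""

    def _decode_run(match: re.Match) -> str:
        raw = bytes.fromhex(match.group(0).replace(".", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8, so it was probably literal text; keep it as is.
            return match.group(0)

    # Underscores become spaces first, so decoded characters are left untouched.
    return _ESCAPE_RUN.sub(_decode_run, anchor.replace("_", " "))


if __name__ == "__main__":
    print(decode_anchor("Test#.3F"))
    print(anchor_encode("Test ?"))
```

Calling decode_anchor on the value from the question, "Test#.3F", turns the ".3F" escape back into "?". Decoding whole runs of escapes at once is a deliberate choice: a non-ASCII character is encoded as several ".XX" bytes, and decoding them one at a time would break the UTF-8 sequence.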
Source: https://stackoverflow.com/questions/15128485/mediawiki-api-section-names-encoding