MediaWiki API section names encoding

Submitted by 好久不见 on 2020-01-02 19:25:11

Question


For [[Test#?]], I get "Test#.3F" from the action=parse module of the MediaWiki API. What is this encoding, and how do I convert it to a human-readable form using a Perl module from CPAN?

URI::Encode handles ordinary percent-decoding, but not this section-name encoding.


Answer 1:


It is UTF-8 percent-encoding, but with "." in place of "%" and with spaces replaced by underscores. In addition, runs of consecutive whitespace are collapsed, and ":" is preserved (not encoded as .3A).
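The description above suggests a simple way to reverse the encoding: turn each "." followed by two hex digits back into a percent escape, then percent-decode as UTF-8. Below is a minimal sketch in Python (the same two steps should port directly to Perl with a regex substitution plus a UTF-8 decode). The function name `decode_section_anchor` is made up for illustration, and the sketch assumes the escapes use uppercase hex digits. The reversal is only a heuristic: a title that literally contains a dot followed by two hex digits, or a literal underscore, cannot be told apart from an encoded character.

```python
import re
from urllib.parse import unquote

# Hypothetical helper, not part of MediaWiki or any library.
# Assumption: escapes look like ".3F" (a dot plus two uppercase hex digits).
_DOT_ESCAPE = re.compile(r"\.([0-9A-F]{2})")


def decode_section_anchor(fragment: str) -> str:
    """Best-effort decoding of a MediaWiki-style section anchor."""
    # Underscores stand for spaces in the encoded form.
    with_spaces = fragment.replace("_", " ")
    # Turn ".XX" back into "%XX" so a standard percent-decoder can handle it.
    percent_encoded = _DOT_ESCAPE.sub(r"%\1", with_spaces)
    # Percent-decode, interpreting the bytes as UTF-8.
    return unquote(percent_encoded, encoding="utf-8", errors="replace")


if __name__ == "__main__":
    page, _, fragment = "Test#.3F".partition("#")
    print(page + "#" + decode_section_anchor(fragment))
```

Only the part after "#" is passed to the decoder, so dots in the page title itself are left alone.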

The exact code which handles it is Parser::guessSectionNameFromWikiText(), but if you do not want to dig through a lot of code, check the much simpler implementation in an older MediaWiki version (compatible except for a few edge cases), in anchorencode():

str_replace( '%', '.', str_replace('+', '_', urlencode( $text ) ) );
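For readers who want to try the legacy one-liner outside PHP, here is a rough Python equivalent, offered as a sketch and not as MediaWiki's actual code. The name `legacy_anchor_encode` is invented for this example. One known difference is handled explicitly: Python's `quote_plus` leaves "~" unescaped, whereas PHP's `urlencode` encodes it. Like the PHP line above, this sketch encodes ":" as .3A and does not collapse whitespace.

```python
from urllib.parse import quote_plus


def legacy_anchor_encode(text: str) -> str:
    """Approximate the PHP one-liner: urlencode, then '%' -> '.' and '+' -> '_'."""
    # quote_plus percent-encodes UTF-8 bytes and turns spaces into '+',
    # much like PHP's urlencode().
    encoded = quote_plus(text, safe="", encoding="utf-8")
    # PHP's urlencode() also escapes '~'; Python's does not, so do it by hand.
    encoded = encoded.replace("~", "%7E")
    # A literal '+' or '%' in the input is already escaped (%2B, %25) at this
    # point, so the replacements below touch only spaces and escape markers.
    return encoded.replace("%", ".").replace("+", "_")


if __name__ == "__main__":
    print(legacy_anchor_encode("?"))
    print(legacy_anchor_encode("Café menu"))
```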


Source: https://stackoverflow.com/questions/15128485/mediawiki-api-section-names-encoding
