Tokenize .htaccess files

房东的猫 提交于 2019-12-07 11:20:58

问题


Bet you didn't see this coming? ;)

So, a project of mine requires that I specifically read and make sense out of .htaccess files.

Sadly, searching on Google only yields the infinite woes of people trying to get their own .htaccess to work (sorry, couldn't resist the comment).

Anyway, I'm a bit scared of trying to get this thing out of open-source projects that use it. See, in the past few weeks, I ended up wasting a lot of time trying to fix my issues with this strategy, only to find out that I did better to read RFCs & specs and build the thing my way.

So, if you know about a library, or any (hopefully clean!) code that does this, please do share. In the mean time, if you know about any articles about .htaccess file format, I'm sure they'll be very handy. Thanks.

NB: I'm pretty much multilingual and could make use of any codebase, even though the end code will be Delphi. I know I'm asking too much, but I'd love to see less of C++. Just think of my mental health before sharing C++ code. :)

Edit: Well, I think I'm just going to do this manually myself. The file structure seems to be:

directive arg1 arg2 argN
<begin directive section>
</end directive section>
# single line comment

回答1:


.htaccess grammar is actually the exact same as the Apache configuration itself, and example parsers do exist for it.

If you're looking to write your own, you are mostly correct on the format. Remember, section tags can be nested and can have parameters (like <Location />)

English method of parsing:

For each line in the file:
  Strip whitespace from beginning and end of line.
  If the line starts with a '#':
    Parse it as a comment (or skip it)

  Else, If the line starts with a '<':
    If the next character is a '/', the line is a closing tag:
      Seek to the next '>' to get the tag name, and pop it from the tag stack.
    Else, the line is an opening tag:
      Seek to the next '>' for the tag name.
      If the tag, trimmed, contains whitespace:
        Split on the first whitespace. The right side is params, left is the tag. 
        (IfModule, Location, etc use this)

      Push the tag name to the tag stack.

  Else, the line is a directive:
    Split the line on whitespace. This is the directive and params.

Just add quote handling and you're set.



来源:https://stackoverflow.com/questions/7112069/tokenize-htaccess-files

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!