When setting environment variables in Apache RewriteRule directives, what causes the variable name to be prefixed with “REDIRECT_”?

Posted by 半城伤御伤魂 on 2019-11-27 06:17:50

This behavior is unfortunate and doesn't even appear to be documented.

.htaccess per-dir context

Here's what appears to happen in .htaccess per-directory (per-dir) context:

Assume that Apache processes an .htaccess file that includes rewrite directives.

  1. Apache populates its environment variable map with all of the standard CGI / Apache variables

  2. Rewriting begins

  3. Environment variables are set in RewriteRule directives

  4. When Apache stops processing the RewriteRule directives (because of an L flag or the end of the ruleset) and the URL has been changed by a RewriteRule, Apache restarts request processing.

    If you're not familiar with this part, see the L flag documentation:

    "... thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over."

  5. From what I can observe, when step 4 happens, step 1 is repeated, and the environment variables that were set in RewriteRule directives are added back to the environment variable map with REDIRECT_ prepended to their names (not necessarily in that order, but that is the end result).

    This step is where the chosen variable names are wiped out, and in a moment I will explain why that is so important and inconvenient.
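
To make the steps above concrete, here is a minimal sketch of a docroot/.htaccess that should reproduce the renaming. The file names old.php and new.php and the variable name STAGE are made up for illustration; they are not from my actual setup.

```apache
RewriteEngine On

# Step 3: STAGE is set by a rule that also changes the URL.
# Step 4: because the URL changed, Apache restarts request processing.
RewriteRule ^old\.php$ new.php [E=STAGE:first,L]

# On the restarted pass the pattern above does not match new.php,
# so STAGE is not set again. Per step 5, the script should then see
# REDIRECT_STAGE=first but no STAGE.
```

Requesting /old.php and dumping the environment from new.php would be one way to check this on your own server.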

Restoring variable names

When I originally ran into this issue, I was doing something like the following in .htaccess (simplified):

RewriteCond %{HTTP_HOST} (.+)\.projects\.
RewriteRule (.*) subdomains/%1/docroot/$1
RewriteRule (.+/docroot)/ - [L,E=EFFECTIVE_DOCUMENT_ROOT:$1]

If I were to set the environment variable in the first RewriteRule, Apache would restart the rewriting process and prepend the variable with REDIRECT_ (steps #4 & 5 above), thus I'd lose access to it via the name I assigned.

In this case, the first RewriteRule changes the URL, so after both RewriteRules are processed, Apache restarts the procedure and processes the .htaccess again. The second time, the first RewriteRule is skipped because of the RewriteCond directive, but the second RewriteRule matches, sets the environment variable (again), and, importantly, doesn't change the URL. So the request / rewriting process does not start over, and the variable name I chose sticks. In this case I actually have both REDIRECT_EFFECTIVE_DOCUMENT_ROOT and EFFECTIVE_DOCUMENT_ROOT. If I were to use an L flag on the first RewriteRule, I'd only have EFFECTIVE_DOCUMENT_ROOT.
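
The same two-pass flow, reduced to a sketch with made-up names. The directories app/ and real/ and the variable APP_ROOT are hypothetical, and I'm assuming real/ has no .htaccess of its own.

```apache
RewriteEngine On

# Pass 1: changes the URL, so request processing restarts afterwards.
RewriteRule ^app/(.*)$ real/$1

# Pass 1 and pass 2: sets the variable without changing the URL.
# The copy set in pass 1 is renamed to REDIRECT_APP_ROOT by the restart;
# the copy set in pass 2 keeps the name APP_ROOT because nothing restarts.
RewriteRule ^real/ - [E=APP_ROOT:real,L]
```

With this in place, a script under real/ would be expected to see both APP_ROOT and REDIRECT_APP_ROOT, matching the outcome described above.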

@trowel's partial solution works similarly: the rewrite directives are processed again, the renamed variable is assigned to the original name again, and if the URL does not change, the process is over and the assigned variable name sticks.

Why those techniques are inadequate

Both of those techniques suffer from a major flaw: when the rewrite rules in the .htaccess file where you set environment variables rewrite the URL to a more deeply nested directory that has an .htaccess file that does any rewriting, your assigned variable name is wiped out again.

Say you have a directory layout like this:

docroot/
        .htaccess
        A.php
        B.php
        sub/
                .htaccess
                A.php
                B.php

And a docroot/.htaccess like this:

RewriteRule ^A\.php sub/B.php [L]
RewriteRule .* - [E=MAJOR:flaw]

So you request /A.php, and it's rewritten to sub/B.php. You still have your MAJOR variable.

However, if you have any rewrite directives in docroot/sub/.htaccess (even just RewriteEngine Off or RewriteEngine On), your MAJOR variable disappears. That's because once the URL is rewritten to sub/B.php, docroot/sub/.htaccess is processed, and if it contains any rewrite directives, rewrite directives in docroot/.htaccess are not processed again. If you had a REDIRECT_MAJOR after docroot/.htaccess was processed (e.g. if you omit the L flag from the first RewriteRule), you'll still have it, but those directives won't run again to set your chosen variable name.

Inheritance

So, say you want to:

  1. set environment variables in RewriteRule directives at a particular level of the directory tree (like docroot/.htaccess)

  2. have them available in scripts at deeper levels

  3. have them available with the assigned names

  4. be able to have rewrite directives in more deeply nested .htaccess files

A possible solution is to use RewriteOptions inherit directives in the more deeply nested .htaccess files. That allows you to re-run the rewrite directives in less deeply nested files and use the techniques outlined above to set the variables with the chosen names. However, note that this increases complexity because you have to be more careful crafting the rewrite directives in the less deeply nested files so that they don't cause problems when run again from the more deeply nested directories. I believe Apache strips the per-dir prefix for the more deeply nested directory and runs the rewrite directives in the less deeply nested files on that value.
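
A sketch of what that could look like for the layout above. The rule that restores MAJOR from REDIRECT_MAJOR is my own illustration, and whether the inherited rules behave well depends on how the parent rules are written, so treat this as a starting point rather than a tested recipe.

```apache
# docroot/sub/.htaccess
RewriteEngine On
# Pull in the rewrite rules from the parent directory's configuration.
RewriteOptions inherit

# docroot/.htaccess would then carry a rule that does not change the URL
# and is safe to run from either directory, for example:
#   RewriteCond %{ENV:REDIRECT_MAJOR} (.+)
#   RewriteRule .* - [E=MAJOR:%1]
```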

@trowel's technique

As far as I can see, support for using a construct like %{ENV:REDIRECT_VAR} in the value component of a RewriteRule E flag (e.g. [E=VAR:%{ENV:REDIRECT_VAR}]) does not appear to be documented:

"VAL may contain backreferences ($N or %N) which will be expanded."

It does appear to work, but if you want to avoid relying on something undocumented (please correct me if I'm wrong about that), it can easily be done this way instead:

RewriteCond %{ENV:REDIRECT_VAR} (.+)
RewriteRule .* - [E=VAR:%1]

SetEnvIf

I don't recommend relying on this, because it doesn't seem to be consistent with the documented behavior (see below), but this (in docroot/.htaccess, with Apache 2.2.20) works for me:

SetEnvIf REDIRECT_VAR (.+) VAR=$1

The SetEnvIf documentation, however, says: "Only those environment variables defined by earlier SetEnvIf[NoCase] directives are available for testing in this manner."

Why?

I don't know what the rationale for prefixing these names with REDIRECT_ is -- not surprising, since it doesn't appear to be mentioned in the Apache documentation sections for mod_rewrite directives, RewriteRule flags, or environment variables.

At the moment it seems like a big nuisance to me, in the absence of an explanation for why it's better than leaving the assigned names alone. The lack of documentation only contributes to my skepticism about it.

Being able to assign environment variables in rewrite rules is useful, or at least it would be, but its usefulness is greatly diminished by this name-changing behavior. The complexity of this post illustrates how unreasonable the behavior is and how many hoops you have to jump through to work around it.

I haven't tested this at all and I know it doesn't address points A or B, but there is some description of this issue in the comments in PHP documentation and some possible solutions for accessing these variables using $_SERVER['VAR']:

http://www.php.net/manual/en/reserved.variables.php#79811

EDIT - some more responses to the question offered:

A: The environment variables are renamed by Apache if they are involved in a redirect. For example, if you have the following rule:

RewriteRule ^index\.php - [E=VAR1:hello,E=VAR2:world]

Then you may access VAR1 and VAR2 using $_SERVER['VAR1'] and $_SERVER['VAR2']. However, if you redirect the page like so:

RewriteRule ^index\.php index2.php [E=VAR1:hello,E=VAR2:world]

Then you must use $_SERVER['REDIRECT_VAR1'], etc.

B: The best way to overcome this issue is to process the variables that you're interested in using PHP. Create a function that runs through the $_SERVER array and finds the items that you need. You might even use a function like this:

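function myGetEnv($key) {
    $prefix = "REDIRECT_";
    // Prefer the name exactly as it was assigned.
    if (array_key_exists($key, $_SERVER)) {
        return $_SERVER[$key];
    }
    foreach ($_SERVER as $k => $v) {
        // Strip every leading REDIRECT_ (one can be added per internal redirect).
        $name = $k;
        while (strpos($name, $prefix) === 0) {
            $name = substr($name, strlen($prefix));
        }
        // Require an exact match so that REDIRECT_MYVAR does not match "VAR".
        if ($name !== $k && $name === $key) {
            return $v;
        }
    }
    return null;
}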

As I don't want to change any of my code (nor can I change code of the libraries used), I went with the following approach: whilst bootstrapping my application – e.g. in my index.php – I rework the $_ENV superglobal so that variables prefixed with REDIRECT_ are rewritten to their normal intended name:

// Fix ENV vars getting prepended with `REDIRECT_` by Apache
foreach ($_ENV as $key => $value) {
    if (substr($key, 0, 9) === 'REDIRECT_') {
        // Strip only the leading prefix; str_replace() would also remove
        // any later occurrence of "REDIRECT_" inside the name.
        $name = substr($key, 9);
        $_ENV[$name] = $value;
        putenv($name . '=' . $value);
    }
}

Not only do we directly set it in $_ENV, but we also store it using putenv(). This way existing code and libraries – which might use getenv() – can work fine.


On a side note: if you're extracting headers – like HTTP_AUTHORIZATION – in your code, you need to do the same kind of manipulation on $_SERVER:

foreach ($_SERVER as $key => $value) {
    if (substr($key, 0, 9) === 'REDIRECT_') {
        // Strip only the leading prefix, not every occurrence.
        $_SERVER[substr($key, 9)] = $value;
    }
}