Resolve a relative path in a URL with PHP

独厮守ぢ 2021-01-13 22:49

Example 1: domain.com/dir_1/dir_2/dir_3/./../../../
This should resolve, as it would in a browser, to domain.com/

Example 2: domain.c

2 Answers
  • 2021-01-13 23:34

    This is a simpler problem than you think. All you need to do is explode() the string on the / character and process the individual segments using a stack. As you traverse the array from left to right: if you see ., do nothing; if you see .., pop an element from the stack; otherwise, push the segment onto the stack.

    $str = 'domain.com/dir_1/dir_2/dir_3/./../../../';
    $array = explode('/', $str);
    $domain = array_shift($array); // the first element is the host

    $parents = array();
    foreach ($array as $dir) {
        switch ($dir) {
            case '.':
                // Current directory: nothing to do
                break;
            case '..':
                // Parent directory: drop the last segment pushed
                array_pop($parents);
                break;
            default:
                $parents[] = $dir;
                break;
        }
    }

    echo $domain . '/' . implode('/', $parents); // domain.com/
    

    This will properly resolve the URLs in all of your test cases.

    Note that error checking is left as an exercise for the reader (e.g. handling a .. when the $parents stack is already empty).
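
    The guard mentioned above could look like the following sketch. It is written in C++ (the language used elsewhere on this page) rather than PHP, and resolve_dots is a hypothetical name, not something from the answer. It assumes a scheme-less URL in which everything before the first / is the host, and it simply ignores a .. that would climb above the root.

```cpp
#include <cassert>   // only needed for the usage checks shown below
#include <string>
#include <vector>

// Hypothetical helper: resolves "." and ".." segments in a scheme-less URL
// such as "domain.com/a/b/../c", using a stack of path segments.
std::string resolve_dots(const std::string& url) {
    const std::size_t first = url.find('/');
    if (first == std::string::npos) return url;  // no path: nothing to resolve

    const std::string domain = url.substr(0, first);
    std::vector<std::string> parents;            // the segment stack

    std::size_t start = first + 1;
    while (true) {
        const std::size_t end = url.find('/', start);
        const std::string dir = url.substr(
            start, end == std::string::npos ? std::string::npos : end - start);

        if (dir == ".") {
            // Current directory: nothing to do.
        } else if (dir == "..") {
            // Guarded pop: a ".." above the root is ignored.
            if (!parents.empty()) parents.pop_back();
        } else {
            parents.push_back(dir);              // may be "" for a trailing '/'
        }

        if (end == std::string::npos) break;     // last segment handled
        start = end + 1;
    }

    // Re-join the surviving segments after the host.
    std::string out = domain + "/";
    for (std::size_t i = 0; i < parents.size(); ++i) {
        if (i > 0) out += '/';
        out += parents[i];
    }
    return out;
}
```

    With this guard, resolve_dots("domain.com/../../x") yields domain.com/x instead of popping from an empty stack, and the URL from Example 1 yields domain.com/.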

  • 2021-01-13 23:39

    What you want here is a "replaceDots" function.

    It works by remembering the position of the last valid item and, when it encounters dots, removing that item. The full description is in RFC 3986, section 5.2.4, "Remove Dot Segments": http://tools.ietf.org/html/rfc3986.

    You need more than one loop. The inner loop scans ahead and looks at the next part, and if that part is dots, the current part is skipped, and so on, but it can be trickier than that. Alternatively, break the path up into parts and then follow the algorithm from the RFC:

    1. While the input buffer is not empty, loop as follows:

      A. If the input buffer begins with a prefix of "../" or "./", then remove that prefix from the input buffer; otherwise,

      B. if the input buffer begins with a prefix of "/./" or "/.", where "." is a complete path segment, then replace that prefix with "/" in the input buffer; otherwise,

      C. if the input buffer begins with a prefix of "/../" or "/..", where ".." is a complete path segment, then replace that prefix with "/" in the input buffer and remove the last segment and its preceding "/" (if any) from the output buffer; otherwise,

      D. if the input buffer consists only of "." or "..", then remove that from the input buffer; otherwise,

      E. move the first path segment in the input buffer to the end of the output buffer, including the initial "/" character (if any) and any subsequent characters up to, but not including, the next "/" character or the end of the input buffer.

    2. Finally, the output buffer is returned as the result of remove_dot_segments.
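
    The steps above can be sketched in C++ roughly as follows. This is a minimal illustration rather than a reference implementation: starts_with and remove_last_segment are hypothetical helper names, and the function expects only the path part of the URL (for example /dir_1/dir_2/dir_3/./../../../), so the host has to be split off first.

```cpp
#include <cassert>   // only needed for the usage checks shown below
#include <string>

// Hypothetical helper: true if `s` begins with `prefix`.
static bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: drops the last segment and its preceding "/" (if any)
// from the output buffer, as step C requires.
static void remove_last_segment(std::string& out) {
    const std::size_t slash = out.rfind('/');
    out.erase(slash == std::string::npos ? 0 : slash);
}

// Sketch of the loop described above; `in` is the input buffer.
std::string remove_dot_segments(std::string in) {
    std::string out;                               // the output buffer
    while (!in.empty()) {
        if (starts_with(in, "../")) {              // step A
            in.erase(0, 3);
        } else if (starts_with(in, "./")) {        // step A
            in.erase(0, 2);
        } else if (starts_with(in, "/./")) {       // step B
            in.replace(0, 3, "/");
        } else if (in == "/.") {                   // step B, "." is the whole segment
            in = "/";
        } else if (starts_with(in, "/../")) {      // step C
            in.replace(0, 4, "/");
            remove_last_segment(out);
        } else if (in == "/..") {                  // step C, ".." is the whole segment
            in = "/";
            remove_last_segment(out);
        } else if (in == "." || in == "..") {      // step D
            in.clear();
        } else {                                   // step E
            // Move the first segment (with its leading "/", if any) to the output.
            const std::size_t next = in.find('/', 1);
            out += in.substr(0, next);
            in.erase(0, next);
        }
    }
    return out;
}
```

    Under these assumptions, the path from Example 1 reduces to /, and the RFC's own example /a/b/c/./../../g reduces to /a/g.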


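    Here is my version of it in C++. It relies on helper types and macros (len_t, char_t, _c, mem_move, __checklenx) that are not shown here: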

    ortl_funcimp(len_t) _str_remove_dots(char_t* s, len_t len) {
      len_t x,yy;
      /*
        Modifies the string in place by copying parts back. Not
        sure if this is the best way to do it since it involves
        many copies for deep relatives like ../../../../../myFile.cpp
    
        For each ../ it does one copy back. If the loop was implemented
        using writing into a buffer, you would have to do both, so this
        seems to be the best technique.
      */
      __checklenx(s,len);
      x = 0;
      while (x < len) {
        if (s[x] == _c('.')) {
          x++;
          if (x < len) {
            if (s[x] == _c('.')) {
              x++;
              if (x < len) {
                if (s[x] == _c('/')) { // ../
                  mem_move(&s[x],&s[x-2],(len-x)*sizeof(char_t));
                  len -= 2;
                  x -= 2;
                }
                else x++;
              }
              else len -= 2;// .. only
            }
            else if (s[x] == _c('/')){ // ./
              mem_move(&s[x],&s[x-1],(len-x)*sizeof(char_t));
              len--;
              x--;
            }
          }
          else --len;// terminating '.', remove
        }
        else if (s[x] == _c('/')) {
          x++;
          if (x < len) {
            if (s[x] == _c('.')) {
              x++;
              if (x < len) {
                if (s[x] == _c('/')) { // /./
                  mem_move(&s[x],&s[x-2],(len-x)*sizeof(char_t));
                  len -= 2;
                  x -= 2;
                }
                else if (s[x] == _c('.')) { // /..
                  x++;
                  if (x < len) { //
                    if (s[x] == _c('/')) {// /../
                      yy = x;
                      x -= 3;
                      if (x > 0) x--;
                      while ((x > 0) && (s[x] != _c('/'))) x--;
                      mem_move(&s[yy],&s[x],(len-yy) * sizeof(char_t));
                      len -= (yy - x);
                    }
                    else {
                      x++;
                    }
                  }
                  else {// ends with /..
                    x -= 3;
                    if (x > 0) x--;
                    while (x > 0 && s[x] != _c('/')) x--;
                    s[x] = _c('/');
                    x++;
                    len = x;
                  }
                }
                else x++;
              }
              else len--;// ends with /.
            }
            else x++;
          }
        }
        else x++;
      }
      return len;
    }
    