Test if string is URL encoded in PHP

不思量自难忘° 2021-01-01 11:10

How can I test if a string is URL encoded?

Which of the following approaches is better?

  • Search the string for characters which would be encoded, which
13 Answers
  • 2021-01-01 11:40

    There's no reliable way to do this, because some strings stay the same through the encoding process: is "abc" encoded or not? There's no clear answer. Also, as you've encountered, some characters have multiple encodings. But...

    Your decode-check-encode-check scheme fails because some characters may be encoded in more than one way. However, a slight modification to your function should be fairly reliable: just check whether decoding modifies the string. If it does, the string was encoded.

    It won't be foolproof, of course: "10+20=30" will return true (the + gets converted to a space), even though it's really just arithmetic. I suppose this is what your scheme is attempting to counter. I'm sorry to say that I don't think there's a perfect solution.
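
    As a rough illustration of the "does decoding change the string?" check, here is a minimal sketch. The function name changedByDecoding and its $plusAsSpace flag are made up for this example. It also shows that switching to rawurldecode(), which leaves + alone, avoids the arithmetic false positive but then no longer recognises + as an encoded space.

    ```php
    <?php
    // Hypothetical helper: reports whether decoding alters the string.
    // $plusAsSpace = true  uses urldecode()    ("+" becomes a space)
    // $plusAsSpace = false uses rawurldecode() ("+" is left untouched)
    function changedByDecoding(string $s, bool $plusAsSpace = true): bool
    {
        $decoded = $plusAsSpace ? urldecode($s) : rawurldecode($s);
        return $decoded !== $s;
    }

    var_dump(changedByDecoding('a%20b'));           // bool(true)
    var_dump(changedByDecoding('abc'));             // bool(false), even though "abc" may be an encoded value
    var_dump(changedByDecoding('10+20=30'));        // bool(true): the false positive described above
    var_dump(changedByDecoding('10+20=30', false)); // bool(false)
    ```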

    HTH.

    Edit:
    As I mentioned in my own comment (just reiterating here for clarity), a good compromise would probably be to check for invalid characters in your URL (e.g. a space); if there are any, it's not encoded. If there are none, try to decode and see if the string changes. This still won't handle the arithmetic above (which is impossible), but it'll hopefully be sufficient.
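
    The compromise could look something like the sketch below. The function name isProbablyUrlEncoded is invented for the example, and the set of "valid" characters is an assumption: it is the alphabet that PHP's urlencode() can produce (letters, digits, "-", "_", ".", "+" and "%"). If your strings come from a different encoder you would need to adjust it.

    ```php
    <?php
    // Hypothetical helper implementing the two-step compromise.
    function isProbablyUrlEncoded(string $s): bool
    {
        // Step 1: a character that urlencode() would never emit (space, "/", ":", ...)
        // means the string cannot be urlencode() output.
        if (preg_match('/[^A-Za-z0-9_.+%-]/', $s) === 1) {
            return false;
        }
        // Step 2: otherwise, treat it as encoded only if decoding changes it.
        return urldecode($s) !== $s;
    }

    var_dump(isProbablyUrlEncoded('hello world'));   // bool(false)
    var_dump(isProbablyUrlEncoded('hello%20world')); // bool(true)
    var_dump(isProbablyUrlEncoded('abc'));           // bool(false)
    var_dump(isProbablyUrlEncoded('10+20'));         // bool(true): still ambiguous
    ```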

  • 2021-01-01 11:41

    I found a way.
    For example, the URL is: https://example.com/xD?foo=bar&uri=https%3A%2F%2Fexample.com%2FxD
    You need to find out whether $_GET['uri'] is encoded or not:

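    // Capture the raw (still undecoded) value of "uri" from the request URI;
    // $_GET['uri'] itself has already been decoded by PHP.
    if (isset($_GET['uri'])
        && preg_match('/[?&]uri=([^&]*)/', $_SERVER['REQUEST_URI'], $r)
        && urldecode($r[1]) === $r[1]) {
      // Code here if the URL is not encoded
    }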
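
    If you would rather not use a regular expression, the raw value can also be pulled out by splitting the query string by hand. This is only a sketch: rawQueryValue is a made-up helper, and it assumes parameters are separated by & and that the name appears at most once.

    ```php
    <?php
    // Hypothetical helper: returns the undecoded value of $name from a request URI,
    // or null if the parameter is absent.
    function rawQueryValue(string $requestUri, string $name): ?string
    {
        $pos = strpos($requestUri, '?');
        if ($pos === false) {
            return null; // no query string at all
        }
        foreach (explode('&', substr($requestUri, $pos + 1)) as $pair) {
            $parts = explode('=', $pair, 2);
            if (urldecode($parts[0]) === $name) {
                return $parts[1] ?? ''; // value is left exactly as it was sent
            }
        }
        return null;
    }

    $raw = rawQueryValue('/xD?foo=bar&uri=https%3A%2F%2Fexample.com%2FxD', 'uri');
    var_dump($raw);                    // string(30) "https%3A%2F%2Fexample.com%2FxD"
    var_dump(urldecode($raw) !== $raw); // bool(true): the value was sent encoded
    ```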
    
  • 2021-01-01 11:41

    In my case I wanted to check if a complete URL is encoded, so I already knew that the URL must contain the string https://, and what I did was to check if the string had the encoded version of https:// in it (https%3A%2F%2F) and if it didn't, then I knew it was not encoded:

    //make sure $completeUrl is encoded
    if (strpos($completeUrl, urlencode('https://')) === false) {
        // not encoded, need to encode it
        $completeUrl = urlencode($completeUrl);
    }
    

    In theory this solution can be used with any string, not just complete URLs, as long as you know that part of the string (https:// in this example) will always exist in what you are trying to check.
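
    A generalised version of the same idea might look like this. The name ensureUrlEncoded and the $knownPart parameter are invented for the sketch. It assumes the known part contains at least one character that urlencode() changes, otherwise the check can't tell the two forms apart.

    ```php
    <?php
    // Hypothetical helper: encodes $value unless the encoded form of $knownPart
    // is already present in it.
    function ensureUrlEncoded(string $value, string $knownPart): string
    {
        $encodedPart = urlencode($knownPart);
        if ($encodedPart === $knownPart) {
            // e.g. "abc" looks identical encoded and unencoded, so it is useless as a marker
            throw new InvalidArgumentException('Known part must change when URL-encoded.');
        }
        return strpos($value, $encodedPart) === false ? urlencode($value) : $value;
    }

    echo ensureUrlEncoded('https://example.com/a b', 'https://'), "\n";
    // https%3A%2F%2Fexample.com%2Fa+b
    echo ensureUrlEncoded('https%3A%2F%2Fexample.com%2Fa+b', 'https://'), "\n";
    // https%3A%2F%2Fexample.com%2Fa+b (left unchanged)
    ```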

  • 2021-01-01 11:45

    @user187291's code works and only fails when + is not encoded.

    I know this is a very old post, but this worked for me.

    $is_encoded = preg_match('~%[0-9A-F]{2}~i', $string);
    if ($is_encoded) {
        $string = urlencode(urldecode(str_replace(['+', '='], ['%2B', '%3D'], $string)));
    } else {
        $string = urlencode($string);
    }
    
  • 2021-01-01 11:50

    private static boolean isEncodedText(String val, String... encoding)
            throws UnsupportedEncodingException {
        // Use the caller-supplied charset if one is given, otherwise the project default.
        String charset = (encoding != null && encoding.length > 0)
                ? encoding[0]
                : TransformFetchConstants.DEFAULT_CHARSET;

        String decodedText = URLDecoder.decode(val, charset);
        // Encode with the same charset; the one-argument encode() is deprecated.
        String encodedText = URLEncoder.encode(decodedText, charset);

        return encodedText.equalsIgnoreCase(val) || !decodedText.equalsIgnoreCase(val);
    }
    
  • 2021-01-01 11:54

    You'll never know for sure if a string is URL-encoded or if it was supposed to have the sequence %2B in it. Instead, it probably depends on where the string came from, i.e. if it was hand-crafted or from some application.

    "Is it better to search the string for characters which would be encoded, which aren't, and if any exist then it's not encoded?"

    I think this is a better approach, since it would take care of things that have been done programmatically (assuming the application would not have left a non-encoded character behind).

    One thing that will be confusing here... Technically, the % "should be" encoded if it will be present in the final value, since it is a special character. You might have to combine your approaches to look for should-be-encoded characters as well as validating that the string decodes successfully if none are found.
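
    One way to combine the two checks is a single pattern that accepts only characters an encoder would leave alone, plus well-formed %XX escapes. This is a sketch under assumptions: couldBeUrlEncoded is a made-up name, and the allowed alphabet is the one PHP's urlencode() emits. Note that it only tells you the string could be encoder output, so "abc" passes even though you can't know whether it was ever encoded.

    ```php
    <?php
    // Hypothetical helper: true if every character is either safe as-is
    // or part of a complete "%" + two-hex-digit escape.
    function couldBeUrlEncoded(string $s): bool
    {
        return preg_match('/\A(?:[A-Za-z0-9_.+-]|%[0-9A-Fa-f]{2})*\z/', $s) === 1;
    }

    var_dump(couldBeUrlEncoded('a%2Bb')); // bool(true)
    var_dump(couldBeUrlEncoded('100%'));  // bool(false): a bare % should have been %25
    var_dump(couldBeUrlEncoded('%zz'));   // bool(false): not a valid escape
    var_dump(couldBeUrlEncoded('a b'));   // bool(false): the space should have been encoded
    var_dump(couldBeUrlEncoded('abc'));   // bool(true): consistent with encoding, but not proof of it
    ```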
