PHP unserialize fails with non-encoded characters?

后端未结

关注

 14  1592

$ser = \'a:2:{i:0;s:5:\"héllö\";i:1;s:5:\"wörld\";}\'; // fails
$ser2 = \'a:2:{i:0;s:5:\"hello\";i:1;s:5:\"world\";}\'; // works
$out = unserialize($ser);
$out2 = un


                      
              相关标签:


      
      
        
          14条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  伪装坚强ぢ        
                
              
                            
                2020-11-27 16:35
              
            
            
                                                                       
In my case the problem was with line endings (likely some editor have changed my file from DOS to Unix). 

I put together these apadtive wrappers:

function unserialize_fetchError($original, &$unserialized, &$errorMsg) {
    $unserialized = @unserialize($original);
    $errorMsg = error_get_last()['message'];
    return ( $unserialized !== false || $original == 'b:0;' );  // "$original == serialize(false)" is a good serialization even if deserialization actually returns false
}

function unserialize_checkAllLineEndings($original, &$unserialized, &$errorMsg, &$lineEndings) {
    if ( unserialize_fetchError($original, $unserialized, $errorMsg) ) {
        $lineEndings = 'unchanged';
        return true;
    } elseif ( unserialize_fetchError(str_replace("\n", "\n\r", $original), $unserialized, $errorMsg) ) {
        $lineEndings = '\n to \n\r';
        return true;
    } elseif ( unserialize_fetchError(str_replace("\n\r", "\n", $original), $unserialized, $errorMsg) ) {
        $lineEndings = '\n\r to \n';
        return true;
    } elseif ( unserialize_fetchError(str_replace("\r\n", "\n", $original), $unserialized, $errorMsg) ) {
        $lineEndings = '\r\n to \n';
        return true;
    } //else
    return false;
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  眼角桃花        
                
              
                            
                2020-11-27 16:37
              
            
            
                                                                       
Serialize:

foreach ($income_data as $key => &$value)
{
    $value = urlencode($value);
}
$data_str = serialize($income_data);


Unserialize:

$data = unserialize($data_str);
foreach ($data as $key => &$value)
{
    $value = urldecode($value);
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  囚心锁ツ        
                
              
                            
                2020-11-27 16:39
              
            
            
                                                                       
this one worked for me.

function mb_unserialize($string) {
    $string = mb_convert_encoding($string, "UTF-8", mb_detect_encoding($string, "UTF-8, ISO-8859-1, ISO-8859-15", true));
    $string = preg_replace_callback(
        '/s:([0-9]+):"(.*?)";/',
        function ($match) {
            return "s:".strlen($match[2]).":\"".$match[2]."\";"; 
        },
        $string
    );
    return unserialize($string);
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  春和景丽        
                
              
                            
                2020-11-27 16:41
              
            
            
                                                                       
In reply to @Lionel above, in fact the function mb_unserialize() as you proposed won't work if the serialized string itself contains char sequence  "; (quote followed by semicolon). 
Use with caution. For example:

$test = 'test";string'; 
// $test is now 's:12:"test";string";'
$string = preg_replace('!s:(\d+):"(.*?)";!se', "'s:'.strlen('$2').':\"$2\";'", $test);
print $string; 
// output: s:4:"test";string";  (Wrong!!)


JSON is the ways to go, as mentioned by others, IMHO

Note: I post this as new answer as I don't know how to reply directly (new here). 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  佛祖请我去吃肉        
                
              
                            
                2020-11-27 16:42
              
            
            
                                                                       
The reason why unserialize() fails with:

$ser = 'a:2:{i:0;s:5:"héllö";i:1;s:5:"wörld";}';


Is because the length for héllö and wörld are wrong, since PHP doesn't correctly handle multi-byte strings natively:

echo strlen('héllö'); // 7
echo strlen('wörld'); // 6


However if you try to unserialize() the following correct string:

$ser = 'a:2:{i:0;s:7:"héllö";i:1;s:6:"wörld";}';

echo '<pre>';
print_r(unserialize($ser));
echo '</pre>';


It works:

Array
(
    [0] => héllö
    [1] => wörld
)


If you use PHP serialize() it should correctly compute the lengths of multi-byte string indexes.

On the other hand, if you want to work with serialized data in multiple (programming) languages you should forget it and move to something like JSON, which is way more standardized.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  小蘑菇        
                
              
                            
                2020-11-27 16:43
              
            
            
                                                                       
This solution worked for me:
$unserialized = unserialize(utf8_encode($st));

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
3
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复