Regex / code to fix corrupt serialized PHP data

夕颜 2020-11-30 07:04

I have a massive multidimensional array that has been serialised by PHP. It has been stored in MySQL and the data field wasn't large enough... the end has been cut off... I

12 Answers
  • 2020-11-30 07:24

    Serializing is almost always bad because you can't search it in any way. Sorry, but it seems as though you're backed into a corner...

  • 2020-11-30 07:31

    Best Solution for me:

    $output_array = unserialize(My_checker($serialized_string));

    code:

    function My_checker($serialized_string){
        // Sanity checks: leave empty, non-serialized, or already valid input untouched
        if (empty($serialized_string))                      return '';
        if ( !preg_match('/^[aOs]:/', $serialized_string) ) return $serialized_string;
        if ( @unserialize($serialized_string) !== false )   return $serialized_string;

        // Recompute the byte length of every s:N:"..."; entry
        return preg_replace_callback(
            '/s:(\d+):"(.*?)";/s',
            function ($matches){ return 's:'.strlen($matches[2]).':"'.$matches[2].'";'; },
            $serialized_string
        );
    }
    
  • 2020-11-30 07:34

    Based on @Emil M's answer, here is a fixed version that works with text containing double quotes (as long as a quote inside the text is not immediately followed by a semicolon).

    function fix_broken_serialized_array($match) {
        return "s:".strlen($match[2]).":\"".$match[2]."\";"; 
    }
    $fixed = preg_replace_callback(
        '/s:([0-9]+):"(.*?)";/',
        "fix_broken_serialized_array",
        $serialized
    );
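    As a rough illustration of the double-quote case, here is a minimal sketch that applies the same pattern to a made-up input. The sample string and variable names are assumptions for the example, not data from the question.

```php
<?php
// Hypothetical sample: the prefix says 3, but the payload `say "hi"` is 8 bytes long.
$serialized = 'a:1:{i:0;s:3:"say "hi"";}';

// Recompute every string length prefix from the bytes actually captured.
$fixed = preg_replace_callback(
    '/s:([0-9]+):"(.*?)";/',
    function ($match) {
        return 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
    },
    $serialized
);

// The original should fail to parse; the repaired copy should parse.
$before = @unserialize($serialized);
$after  = @unserialize($fixed);

var_dump($before);
var_dump($fixed);
var_dump($after);
```

    The lazy `(.*?)` keeps extending until it reaches the first `";`, which is why an inner quote followed by other characters is captured as part of the text.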
    
  • 2020-11-30 07:36

    This recalculates the length of the string elements in a serialized array:

    $fixed = preg_replace_callback(
        '/s:([0-9]+):\"(.*?)\";/',
        function ($matches) { return "s:".strlen($matches[2]).':"'.$matches[2].'";';     },
        $serialized
    );
    

    However, it doesn't work if your strings contain ";. In that case it's not possible to fix the serialized array string automatically -- manual editing will be needed.
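    The limitation can be reproduced with a small, made-up input. In this sketch the payload `a";b` is an assumption chosen because it contains `";` itself; the pattern stops at the first `";` and produces a string that still cannot be unserialized.

```php
<?php
// Hypothetical sample: payload `a";b` (4 bytes) stored with a wrong prefix of 9.
$serialized = 'a:1:{i:0;s:9:"a";b";}';

$fixed = preg_replace_callback(
    '/s:([0-9]+):"(.*?)";/',
    function ($matches) {
        return 's:' . strlen($matches[2]) . ':"' . $matches[2] . '";';
    },
    $serialized
);

// The match ends at the first `";`, so only `a` is counted and `b";` is left dangling.
$result = @unserialize($fixed);

// What a manual repair would have to produce instead.
$manual = 'a:1:{i:0;s:4:"a";b";}';

var_dump($fixed);
var_dump($result);
var_dump(unserialize($manual));
```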

  • 2020-11-30 07:36

    I tried everything in this post and nothing worked for me. After hours of pain, here is what I found deep in Google's results that finally worked:

    function fix_str_length($matches) {
        $string = $matches[2];
        $right_length = strlen($string); // yes, strlen even for UTF-8 characters, PHP wants the mem size, not the char count
        return 's:' . $right_length . ':"' . $string . '";';
    }
    function fix_serialized($string) {
        // sanity checks: skip input that is not serialized or is already valid
        if ( !preg_match('/^[aOs]:/', $string) ) return $string;
        if ( @unserialize($string) !== false ) return $string;
        $string = preg_replace("%\n%", "", $string);
        // split on every "; so each piece holds at most one string entry
        $data = preg_replace('%";%', "µµµ", $string);
        $tab = explode("µµµ", $data);
        $new_data = '';
        foreach ($tab as $line) {
            $new_data .= preg_replace_callback('%\bs:(\d+):"(.*)%', 'fix_str_length', $line);
        }
        return $new_data;
    }
    

    You call the routine as follows:

    //Let's consider we store the serialization inside a txt file
    $corruptedSerialization = file_get_contents('corruptedSerialization.txt');
    
    //Try to unserialize original string (@ suppresses the notice on failure)
    $unSerialized = @unserialize($corruptedSerialization);
    
    //In case of failure let's try to repair it
    if($unSerialized === false){
        $repairedSerialization = fix_serialized($corruptedSerialization);
        $unSerialized = unserialize($repairedSerialization);
    }
    
    //Keep your fingers crossed
    var_dump($unSerialized);
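    The comment in fix_str_length about strlen can be checked with a short sketch. The sample text is an assumption; it is written with explicit byte escapes so the result does not depend on the encoding of the source file.

```php
<?php
// "café" in UTF-8: 4 characters, but 5 bytes because "é" is 0xC3 0xA9.
$text = "caf\xC3\xA9";

// serialize() records the byte count, which is what strlen() returns.
$serialized = serialize($text);

var_dump(strlen($text));
var_dump($serialized);

// A prefix based on the character count (4) makes the same payload unparseable.
$charCountVersion = 's:4:"' . $text . '";';
$parsed = @unserialize($charCountVersion);

var_dump($parsed);
```

    This is why the repair functions in this thread use strlen rather than a multibyte-aware character count.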
    
  • 2020-11-30 07:39

    I doubt anyone would write code to retrieve partially saved arrays :) I fixed something like this once, but by hand, and it took hours. Then I realized I didn't need that part of the array...

    Unless it's really important data (and I mean REALLY important), you'd be better off letting this one go.
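    To show why a cut-off string is a different problem from a wrong length prefix, here is a minimal sketch. The sample array and the 40-byte cut-off are made up for illustration; the point is only that recalculating string lengths leaves truncated data unchanged.

```php
<?php
// Hypothetical data standing in for the real array.
$original = serialize(['first' => 'hello', 'second' => 'world']);

// Simulate a column that was too short: keep only the first 40 bytes.
$truncated = substr($original, 0, 40);

// Length recalculation only touches complete s:N:"..."; entries.
$recalculated = preg_replace_callback(
    '/s:([0-9]+):"(.*?)";/',
    function ($m) {
        return 's:' . strlen($m[2]) . ':"' . $m[2] . '";';
    },
    $truncated
);

$parsed = @unserialize($recalculated);

var_dump($truncated);
var_dump($recalculated === $truncated); // nothing was repaired
var_dump($parsed);
```

    The missing closing quote, remaining elements, and closing brace are simply not there, so any recovery has to decide by hand (or by custom parsing) how much of the tail to discard.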
