serialize a large array in PHP?

死守一世寂寞 2021-01-04 06:11

I am curious, is there a size limit on serialize in PHP. Would it be possible to serialize an array with 5,000 keys and values so it can be stored into a cache?


13 answers
  • 2021-01-04 06:28

    I think json_encode() is better than serialize() here. It has one drawback: associative arrays and objects are not distinguished when decoding. But the resulting string is smaller and easier for a human to read, and therefore also to debug and edit.
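
    As a hedged illustration of that trade-off (the exact byte counts depend on the data):

    $arr = array('name' => 'Alice', 'age' => 30);

    // serialize() keeps PHP type information; json_encode() is more compact.
    $ser  = serialize($arr);   // a:2:{s:4:"name";s:5:"Alice";s:3:"age";i:30;}
    $json = json_encode($arr); // {"name":"Alice","age":30}

    var_dump(strlen($ser));  // int(44)
    var_dump(strlen($json)); // int(25)

    // The drawback: json_decode() returns stdClass objects by default,
    // so associative arrays and objects are no longer distinguished.
    var_dump(json_decode($json));       // object(stdClass)
    var_dump(json_decode($json, true)); // array(2), when decoding to arrays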

  • 2021-01-04 06:28

    I've just come across an instance where I thought I was hitting an upper limit of serialisation.

    I'm persisting serialised objects to a database using a MySQL TEXT field.

    The limit of a TEXT field for single-byte characters is 65,535, so while I can serialise much larger objects than that with PHP, it's impossible to unserialise them, as they are truncated by the limit of the TEXT field.
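
    A defensive sketch along those lines (65,535 bytes is the single-byte TEXT limit; everything else here is illustrative):

    // Guard against silent truncation before writing to a TEXT column.
    define('MYSQL_TEXT_LIMIT', 65535);

    $payload = serialize(array_fill(0, 5000, 'some cached value'));

    if (strlen($payload) > MYSQL_TEXT_LIMIT) {
        // Options: a MEDIUMTEXT/LONGTEXT column, gzcompress(), or
        // splitting the payload across several rows.
        trigger_error('Serialized payload of ' . strlen($payload)
            . ' bytes exceeds the TEXT limit', E_USER_WARNING);
    }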

  • 2021-01-04 06:29

    As quite a few other people have already answered, just for fun, here's a very quick benchmark (dare I call it that?); consider the following code:

    $num = 1;

    // 5,000 string values; each one is $num repetitions of a
    // 10-character chunk, so element size scales with $num.
    $list = array_fill(0, 5000, str_repeat('1234567890', $num));

    $before = microtime(true);
    for ($i = 0; $i < 10000; $i++) {
        $str = serialize($list);
    }
    $after = microtime(true);

    var_dump($after - $before);        // elapsed time, in seconds
    var_dump(memory_get_peak_usage()); // peak memory, in bytes
    

    I'm running this on PHP 5.2.6 (the one bundled with Ubuntu jaunty).
    And, yes, there are only values, no explicit keys, and the values are quite simple: no objects, no sub-arrays, nothing but strings.

    For $num = 1, you get:

    float(11.8147978783)
    int(1702688)
    

    For $num = 10, you get:

    float(13.1230671406)
    int(2612104)
    

    And, for $num = 100, you get:

    float(63.2925770283)
    int(11621760)
    

    So, it seems that the bigger each element of the array is, the longer it takes (which seems fair, actually). But for elements 100 times bigger, it doesn't take 100 times longer...


    Now, with an array of 50,000 elements instead of 5,000, which means this part of the code is changed:

    $list = array_fill(0, 50000, str_repeat('1234567890', $num));
    

    With $num = 1, you get:

    float(158.236332178)
    int(15750752)
    

    Considering the time it took for $num = 1, I won't be running this for either $num = 10 or $num = 100...


    Yes, of course, in a real situation, you wouldn't be doing this 10,000 times; so let's try with only 10 iterations of the for loop.

    For $num = 1:

    float(0.206310987473)
    int(15750752)
    

    For $num = 10:

    float(0.272629022598)
    int(24849832)
    

    And for $num = 100:

    float(0.895547151566)
    int(114949792)
    

    Yeah, that's almost 1 second -- and quite a bit of memory used ^^
    (No, this is not a production server: I have a pretty high memory_limit on this development machine ^^)


    So, in the end, to sum those numbers up -- and, yes, you can make numbers say whatever you want them to -- I wouldn't say there is a "limit" as in "hardcoded into PHP", but you'll end up facing one of these (see the sketch after the list):

    • max_execution_time (generally, on a webserver, it's never more than 30 seconds)
    • memory_limit (on a webserver, it's generally not much more than 32MB)
    • the load your webserver will have: while one of those big serialize loops was running, it took up one of my CPUs; if several users hit the same page at the same time, I'll let you imagine what that would give ;-)
    • the patience of your users ^^
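
    As a small illustrative check (not part of the original benchmark), the first two limits can be inspected, and raised if you control the environment, at runtime:

    // Inspect the PHP-level limits mentioned above.
    var_dump(ini_get('max_execution_time')); // e.g. string(2) "30"
    var_dump(ini_get('memory_limit'));       // e.g. string(3) "32M"

    // Raise them for a heavy job (values here are just examples):
    set_time_limit(120);             // seconds; 0 means no limit
    ini_set('memory_limit', '256M');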

    But, unless you are really serializing long arrays of big data, I am not sure it will matter that much...
    And you must take into consideration the amount of time/CPU load that using that cache might help you gain ;-)

    Still, the best way to know is to test it yourself, with real data ;-)


    And you might also want to take a look at what Xdebug can do when it comes to profiling: this kind of situation is one of those it is useful for!

  • 2021-01-04 06:33

    Ok... more numbers! (PHP 5.3.0 OSX, no opcode cache)

    @Pascal's code on my machine, for $num = 1 at 10k iterations, produces:

    float(18.884856939316)
    int(1075900)
    

    I add unserialize() to the above, like so:

    $num = 1;

    $list = array_fill(0, 5000, str_repeat('1234567890', $num));

    $before = microtime(true);
    for ($i = 0; $i < 10000; $i++) {
        $str  = serialize($list);
        $list = unserialize($str); // round-trip: decode what was just encoded
    }
    $after = microtime(true);

    var_dump($after - $before);
    var_dump(memory_get_peak_usage());
    

    This produces:

    float(50.204112052917)
    int(1606768) 
    

    I assume the extra 600k or so is the serialized string.

    I was curious about var_export() and its include/eval partner. Using $str = var_export($list, true); instead of serialize() in the original produces:

    float(57.064643859863)
    int(1066440)
    

    So, just a little less memory (at least for this simple example), but way more time already.

    Adding in eval('$list = ' . $str . ';'); instead of unserialize() in the above produces:

    float(126.62566018105)
    int(2944144)
    

    This indicates there's probably a memory leak somewhere when doing eval() :-/

    So again, these aren't great benchmarks (I really should isolate the eval/unserialize by putting the string in a local variable or something, but I'm being lazy), but they show the associated trends. var_export() seems slow.
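
    For what it's worth, a sketch of that isolation: serialize once outside the loop, so only unserialize() is being timed (same shape and assumptions as the earlier benchmarks):

    $list = array_fill(0, 5000, str_repeat('1234567890', 1));
    $str  = serialize($list); // built once, outside the timed loop

    $before = microtime(true);
    for ($i = 0; $i < 10000; $i++) {
        $copy = unserialize($str); // only the decoding is timed now
    }
    $after = microtime(true);

    var_dump($after - $before);
    var_dump(memory_get_peak_usage());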

  • 2021-01-04 06:42

    The serialize() function is only limited by available memory.
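
    A rough way to watch that in practice (illustrative sizes; figures vary per machine):

    // serialize() itself imposes no cap; memory is the practical ceiling.
    foreach (array(1000, 10000, 100000) as $n) {
        $str = serialize(array_fill(0, $n, 'value'));
        printf("%6d elements -> %8d bytes, peak memory %d\n",
            $n, strlen($str), memory_get_peak_usage());
    }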

  • 2021-01-04 06:42

    There's no limit enforced by PHP. serialize() returns a byte-stream representation (a string) of the serialized structure, so you would just get a large string.
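
    For the 5,000-key array from the question, a quick sanity check (values are illustrative):

    $arr = array();
    for ($i = 0; $i < 5000; $i++) {
        $arr['key' . $i] = 'value' . $i;
    }

    $str = serialize($arr);
    var_dump(strlen($str));              // one large string, tens of KB
    var_dump($arr == unserialize($str)); // bool(true) -- round-trips intact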
