Merging file chunks in PHP

盖世英雄少女心 2021-02-10 01:35

For educational purposes, I wanted to implement chunked file upload. How do you know when all of the chunks have been uploaded?

I tried to move chunks from temp

1 answer
  •  生来不讨喜
    2021-02-10 02:01

    Sorry for my previous comments, I misunderstood the question. This question is interesting and fun to play with.

    The expression you are looking for is this:

    $target_path = ROOT.'/upload/';
    
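    $tmp_name = $_FILES['upload']['tmp_name'];
    // basename() strips any directory part the client put in the name
    $filename = basename($_FILES['upload']['name']);
    $target_file = $target_path.$filename;
    // POST values arrive as strings, so cast them to integers
    $num = (int) $_POST['num'];
    $num_chunks = (int) $_POST['num_chunks'];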
    
    move_uploaded_file($tmp_name, $target_file.$num);
    
    // count the number of uploaded chunks
    // (loop up to the total, not the current chunk, so out-of-order arrival still works)
    $chunksUploaded = 0;
    for ( $i = 1; $i <= $num_chunks; $i++ ) {
        if ( file_exists( $target_file.$i ) ) {
             ++$chunksUploaded;
        }
    }
    
    // and THAT's what you were asking for
    // when this triggers - that means your chunks are uploaded
    // cast before the strict comparison: the POST value is a string, the counter is an int
    if ($chunksUploaded === (int) $num_chunks) {
    
        /* here you can reassemble chunks together */
        // open the final file once; 'wb' starts from an empty file,
        // so a leftover file with the same name is not appended to
        $final = fopen($target_file, 'wb');
        for ($i = 1; $i <= $num_chunks; $i++) {

          $file = fopen($target_file.$i, 'rb');
          // copy the whole chunk, whatever its size
          stream_copy_to_stream($file, $final);
          fclose($file);

          unlink($target_file.$i);
        }
        fclose($final);
    }
    

    And this must be mentioned:

    The weak point of my version shows up when you expect the files

    • 'tmp-1',

    • 'tmp-2',

    • 'tmp-3'

    but the upload is interrupted after 'tmp-2' has been sent. The leftover tmp-2 then pollutes the tmp folder and will interfere with future uploads of the same filename. That is a time bomb.

    To counter that, you must find a way to change tmp to something more unique.

    • 'tmp-ABCew-1',

    • 'tmp-ABCew-2',

    • 'tmp-ABCew-3'

    is a bit better, where 'ABCew' could be called 'chunksSessionId': you generate it randomly and send it with each POST. Collisions are still possible as the space of random names fills up. You could add the time to the equation; for example, you can see that

    • 'tmp-ABCew-2016-03-17-00-11-22--1',

    • 'tmp-ABCew-2016-03-17-00-11-22--2',

    • 'tmp-ABCew-2016-03-17-00-11-22--3'

    is much more collision-resistant, but it is difficult to implement and opens a whole can of worms: the client's date and time are controlled by the client and can be spoofed, so this data is unreliable.
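
    One way around the unreliable client clock is to let the server generate the session id, using its own clock plus random bytes, and hand it to the client before the first chunk is sent. Below is a minimal sketch of that idea, not a finished design. The function names `newChunkSessionId`, `isValidChunkSessionId` and `chunkPath` are made up for illustration, and the id format is only an assumption modelled on the names above.

```php
<?php
// Hypothetical helper: builds an id from the SERVER clock plus random bytes,
// so nothing in it depends on data the client controls.
function newChunkSessionId(): string
{
    $stamp = gmdate('Y-m-d-H-i-s');        // server time in UTC
    $random = bin2hex(random_bytes(16));   // 16 CSPRNG bytes as 32 hex characters
    return $stamp . '-' . $random;
}

// Hypothetical helper: the client sends the id back with every chunk,
// so check its shape before using it in a file path.
function isValidChunkSessionId(string $id): bool
{
    return preg_match('/^\d{4}(-\d{2}){5}-[0-9a-f]{32}$/', $id) === 1;
}

// Hypothetical helper: path of chunk number $num for the given session.
function chunkPath(string $dir, string $sessionId, int $num): string
{
    return rtrim($dir, '/') . '/tmp-' . $sessionId . '--' . $num;
}
```

    The server would call `newChunkSessionId()` once per upload, and every later request would have to pass `isValidChunkSessionId()` before `chunkPath()` is used. With 128 random bits a collision is very unlikely, but this sketch does not check whether a given id was actually issued by the server.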

    So making the tmp name unique is a complex task. Designing a system that makes it reliable is an interesting problem ^ ^ You can play with that.
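
    Whatever naming scheme you pick, chunks from interrupted uploads still stay on disk. A possible complement is to delete chunk files that have not been touched for a while. This is only a sketch: `removeStaleChunks` is a hypothetical name, and the `tmp-*` pattern and the one-hour limit in the usage line are assumptions, not something from the code above.

```php
<?php
// Hypothetical helper: deletes chunk files in $dir whose last modification
// is more than $maxAge seconds old, and returns how many were removed.
// The 'tmp-*' pattern is an assumption about how chunk files are named.
function removeStaleChunks(string $dir, int $maxAge, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    // glob() returns false on error; fall back to an empty list in that case
    foreach (glob(rtrim($dir, '/') . '/tmp-*') ?: [] as $path) {
        if (is_file($path) && $now - filemtime($path) > $maxAge) {
            if (unlink($path)) {
                $removed++;
            }
        }
    }
    return $removed;
}

// Example call (commented out): drop chunks older than one hour
// removeStaleChunks(ROOT . '/upload', 3600);
```

    You could run it from a cron job or at the start of each upload request. Pick a limit comfortably longer than your slowest expected upload, otherwise chunks of an upload that is still in progress may be removed.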
