PHP: Using fgetcsv on a huge CSV file

Asked by 失恋的感觉 on 2021-01-06 07:58

Using fgetcsv, can I somehow do a destructive read where rows I've read and processed would be discarded, so if I don't make it through the whole file on the first pass I can come back and pick up where I left off before the script times out?

3 Answers
  •  花落未央
    2021-01-06 08:38

    From your problem description it really sounds like you need to switch hosts. Processing a 2 GB file with a hard time limit is not a very constructive environment. Having said that, deleting read lines from the file is even less constructive, since you would have to rewrite the entire 2 GB to disk minus the part you have already read, which is incredibly expensive.

    Assuming you save how many rows you have already processed, you can skip rows like this:

    $alreadyProcessed = 42; // for example

    $fileHandle = fopen('my.csv', 'r');

    $i = 0;
    while (($row = fgetcsv($fileHandle)) !== false) {
        if ($i++ < $alreadyProcessed) {
            continue; // this row was already handled on a previous run
        }

        // ... process $row ...
    }
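
    How you save the number of rows you have already processed is up to you; as a minimal sketch (the processed_rows.txt name is just an example, not something from the original answer), you could keep the counter in a small text file:

    $counterFile = 'processed_rows.txt'; // hypothetical file holding the row count

    // load the count left behind by the previous run (0 if there is none yet)
    $alreadyProcessed = is_readable($counterFile)
        ? (int) file_get_contents($counterFile)
        : 0;

    // ... and inside the loop above, after processing each row:
    file_put_contents($counterFile, (string) $i);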
    

    However, this means you're reading the entire 2 GB file from the beginning on every run, which already takes a while by itself, and you'll be able to process fewer and fewer rows each time you start again.

    The best solution here is to remember the current position of the file pointer, for which ftell is the function you're looking for:

    $lastPosition = is_readable('last_position.txt')
        ? (int) file_get_contents('last_position.txt')
        : 0;
    $fh = fopen('my.csv', 'r');
    fseek($fh, $lastPosition);

    while (($row = fgetcsv($fh)) !== false) {
        // ... process $row ...

        file_put_contents('last_position.txt', ftell($fh));
    }
    

    This allows you to jump right back to the last position you were at and continue reading. You obviously want to add a lot of error handling here, so you're never left in an inconsistent state at whatever point your script is interrupted; one small example of that follows below.
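
    For instance, one simple way to avoid a half-written position file (not from the original answer, just a sketch reusing its last_position.txt name) is to write the position to a temporary file and then rename it into place, since rename() on the same filesystem replaces the target atomically:

    function savePosition(int $position, string $file = 'last_position.txt'): void
    {
        // write to a temp file first, then rename it into place;
        // an interruption mid-write can then never corrupt the real file
        $tmp = $file . '.tmp';
        if (file_put_contents($tmp, (string) $position) === false) {
            throw new RuntimeException("Could not write $tmp");
        }
        if (!rename($tmp, $file)) {
            throw new RuntimeException("Could not rename $tmp to $file");
        }
    }

    Inside the loop above you would then call savePosition(ftell($fh)) instead of calling file_put_contents() directly.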
