Import Large CSV file into MySQL

Posted by 徘徊边缘 on 2019-12-03 09:13:58
Lee

Try optimising your scripts first. First off, never run single-row queries when importing unless you have no other choice; the network overhead can be a killer.

Try something like this (obviously untested and typed straight into the SO text box, so check that the brackets match, etc.):

$url = 'http://www.example.com/directory/file.csv';
if (($handle = fopen($url, "r")) !== FALSE) 
{
fgetcsv($handle, 1000, ","); // read and discard the first line (presumably the header row)

$imports = array();

while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) 
{
    $EvID = $data[0];
    $Ev = $data[1];
    $PerID = $data[2];
    $Per = $data[3];
    $VName = $data[4];
    $VID = $data[5];
    $VSA = $data[6];
    $DateTime = $data[7];
    $PCatID = $data[8];
    $PCat = $data[9];
    $CCatID = $data[10];
    $CCat = $data[11];
    $GCatID = $data[12];
    $GCat = $data[13];
    $City = $data[14];
    $State = $data[15];
    $StateID = $data[16];
    $Country = $data[17];
    $CountryID = $data[18];
    $Zip = $data[19];
    $TYN = $data[20];
    $IMAGEURL = $data[21];
    $URLLink = $data[22];

    // Normalise the date column to MySQL's DATETIME format.
    $data[7] = strtotime($data[7]);
    $data[7] = date("Y-m-d H:i:s",$data[7]);

    if((($PCatID == '2') && (($CountryID == '217') or ($CountryID == '38'))) || (($GCatID == '16') or ($GCatID == '19') or ($GCatID == '30') or ($GCatID == '32'))) 
    {

    $imports[] = "('".md5($EvID.$PerID)."','".addslashes($data[0])."','".addslashes($data[1])."','".addslashes($data[2])."','".addslashes($data[3])."','".addslashes($data[4])."',
                    '".addslashes($data[5])."','".addslashes($data[6])."','".addslashes($data[7])."','".addslashes($data[8])."','".addslashes($data[9])."',
                '".addslashes($data[10])."','".addslashes($data[11])."','".addslashes($data[12])."','".addslashes($data[13])."','".addslashes($data[14])."',
                    '".addslashes($data[15])."','".addslashes($data[16])."','".addslashes($data[17])."','".addslashes($data[18])."','".addslashes($data[19])."',
                '".addslashes($data[20])."','".addslashes($data[21])."')";



    }
}

$importarrays = array_chunk($imports, 100);
foreach($importarrays as $arr) {

    // Note: the mysql_* functions were removed in PHP 7; on current versions use mysqli or PDO instead.
    if (!mysql_query("INSERT IGNORE INTO TNDB_CSV2 
                (id, EvID, Event, PerID, Per, VName,
                     VID, VSA, DateTime, PCatID, PCat,                
                CCatID, CCat, GCatID, GCat, City,
                     State, StateID, Country, CountryID, Zip,
                TYN, IMAGEURL) VALUES ".implode(',', $arr))) {

        die("error: ".mysql_error());

    }

 }

fclose($handle);
}

Play around with the chunk size passed to array_chunk: too big and it may cause problems such as the query being too long (yes, there is a configurable limit, max_allowed_packet, in my.cnf); too small and it's unnecessary overhead.
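One way to avoid guessing the chunk size is to cap each batch by statement length rather than by row count. A minimal sketch, assuming the rows are already-escaped value tuples like the $imports entries above; the chunkByBytes name and the byte budget are illustrative and not part of the original answer, and the budget should leave headroom below the server's max_allowed_packet for the INSERT ... VALUES prefix:

```php
<?php
/**
 * Group pre-built "(...)" value tuples into batches whose combined
 * length stays within $maxBytes (hypothetical helper).
 */
function chunkByBytes(array $tuples, int $maxBytes): array
{
    $batches = [];
    $current = [];
    $size = 0;

    foreach ($tuples as $tuple) {
        $len = strlen($tuple) + 1; // +1 for the joining comma
        // Start a new batch when adding this tuple would exceed the budget.
        if ($current !== [] && $size + $len > $maxBytes) {
            $batches[] = $current;
            $current = [];
            $size = 0;
        }
        $current[] = $tuple;
        $size += $len;
    }
    // Keep the final, partially filled batch.
    if ($current !== []) {
        $batches[] = $current;
    }
    return $batches;
}

// Example: a tiny budget forces more than one batch.
$batches = chunkByBytes(["(1,'a')", "(2,'b')", "(3,'c')"], 16);
```

Each batch can then be passed to implode(',', ...) exactly as the array_chunk result is. A single tuple larger than the budget still gets a batch of its own.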

You could also drop the assignment of $data[x] to variables, as it's a waste given how small the script is; just use $data[x] directly in your query, etc. (It won't give a massive improvement, but depending on your import size it could save a little.)

The next thing would be to use low priority inserts/updates. To get started, see the question "How to give priority to certain queries?".
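As a rough illustration, a low priority insert only needs the LOW_PRIORITY modifier after INSERT. This sketch uses a shortened, hypothetical two-column list and a made-up helper name. Note that MySQL applies LOW_PRIORITY only to storage engines that use table-level locking (such as MyISAM, MEMORY and MERGE), so it has no effect on InnoDB tables:

```php
<?php
// Build a batched low-priority insert (hypothetical helper; the column
// list is shortened for illustration).
function buildLowPriorityInsert(array $tuples): string
{
    return "INSERT LOW_PRIORITY IGNORE INTO TNDB_CSV2 (id, EvID) VALUES "
        . implode(',', $tuples);
}

// The tuples are assumed to be escaped already, as in the script above.
$sql = buildLowPriorityInsert(["('a1','10')", "('b2','11')"]);
```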

After all of that, you could look into MySQL configuration optimisations, but that's one for Google to explain really, as the best settings are different for everyone and their unique situation.

Edit: Another thing I've done before: if you have a lot of keys set up that aren't required for the import, you can drop those keys temporarily and add them back when the script is done. This can yield good time improvements too, but as you're working on a live database there are pitfalls to work around if you go down that route.
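A sketch of the drop-and-restore idea, assuming a non-unique index named idx_city on the City column. Both the index and the helper name are hypothetical, since the answer does not say which keys the table has. The import runs between the two ALTER TABLE statements:

```php
<?php
// Hypothetical helper: run an import between dropping and re-adding a
// secondary index. $exec sends one SQL statement to the database;
// $runImport is any callable that performs the inserts.
function importWithoutIndex(callable $exec, callable $runImport): void
{
    $exec("ALTER TABLE TNDB_CSV2 DROP INDEX idx_city");
    try {
        $runImport();
    } finally {
        // Re-create the index even if the import throws.
        $exec("ALTER TABLE TNDB_CSV2 ADD INDEX idx_city (City)");
    }
}

// Usage with a stand-in executor that only records the statements.
$log = [];
importWithoutIndex(
    function (string $sql) use (&$log) { $log[] = $sql; },
    function () use (&$log) { $log[] = '-- import runs here'; }
);
```

In a real script $exec would forward to the database connection. Queries that rely on the index will be slow while it is missing, which is one of the live-database pitfalls mentioned above.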

bpgergo

Try a batch insert using the implode() function. For further explanation and an example, see the thread "insert multiple rows via a php array into mysql".
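For illustration, a minimal sketch of a batch insert built with implode(). It uses "?" placeholders for a prepared statement instead of string escaping, which is a swapped-in technique and not necessarily what the linked thread shows; the helper, table and column names are hypothetical:

```php
<?php
// Build one multi-row INSERT with "?" placeholders plus a flat parameter list.
// Table and column names are assumed to come from trusted code, not user input.
function buildBatchInsert(string $table, array $columns, array $rows): array
{
    // One "(?,?,...)" group per row.
    $group = '(' . implode(',', array_fill(0, count($columns), '?')) . ')';
    $sql = "INSERT INTO `$table` (`" . implode('`,`', $columns) . "`) VALUES "
        . implode(',', array_fill(0, count($rows), $group));

    // Flatten the rows into a single positional parameter array.
    $params = [];
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }
    return [$sql, $params];
}

[$sql, $params] = buildBatchInsert('events', ['id', 'name'], [[1, 'a'], [2, 'b']]);
// With a PDO connection in $pdo: $pdo->prepare($sql)->execute($params);
```

Because the values travel as bound parameters, no addslashes() call is needed for them.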

I used this query:

$sql = "
        LOAD DATA LOCAL INFILE 'uploads/{$fileName}'
        REPLACE INTO TABLE `order`
        FIELDS
            TERMINATED BY '\t'
        LINES
            TERMINATED BY '\r\n'
        IGNORE 1 LINES
        (product_id, `date`, quantity)
        ";

It's super fast.
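To run a statement like this from PHP, the client connection must allow LOCAL INFILE and the server must have local_infile enabled. A hedged sketch using PDO; the function names, DSN, credentials and file path are placeholders and not part of the original answer:

```php
<?php
// Build the LOAD DATA statement. $quotedPath must already be a quoted
// SQL string literal, e.g. the result of PDO::quote().
function buildLoadDataSql(string $quotedPath): string
{
    return "LOAD DATA LOCAL INFILE $quotedPath
        REPLACE INTO TABLE `order`
        FIELDS TERMINATED BY '\\t'
        LINES TERMINATED BY '\\r\\n'
        IGNORE 1 LINES
        (product_id, `date`, quantity)";
}

// Hypothetical wrapper: executes the statement and returns the affected row count.
function runLoadData(PDO $pdo, string $path): int
{
    return (int) $pdo->exec(buildLoadDataSql($pdo->quote($path)));
}

// Placeholder connection settings; LOCAL INFILE must be enabled on the client.
// $pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'secret', [
//     PDO::MYSQL_ATTR_LOCAL_INFILE => true,
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);
// echo runLoadData($pdo, 'uploads/orders.tsv');
```

Quoting the path through the connection avoids interpolating a raw file name into the SQL string.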
