Build a batch query for MySQL insert each 1000 items

Submitted by 痴心易碎 on 2019-12-25 04:52:26

Question


I need to perform a batch insert in MySQL/MariaDB, but since the data is dynamic I need to build the proper SQL query. In a few steps:

  • I should find out whether the current row already exists in the table; this is the first SELECT inside the loop.
  • Right now I have 1454 rows but will have to insert around 150k later, so a batch query is better than 150k single-row INSERTs inside the loop.
  • If the record already exists I should update it; if it doesn't, I should insert it. I don't care about the UPDATE yet, so the code you're seeing only handles the INSERT.

So here is what I am doing:

// Get values from Csv file as an array of values
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";

$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";

// Processing on each row of data
foreach ($data as $row) {
    $sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
    echo "DEBUG: ", $sql, "\n";
    $rs = $conn->query($sql);

    if ($rs === false) {
        echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
    } else {
        $rows_returned = $rs->num_rows;

        $veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
        $first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
        $last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
        $email = "'".$conn->real_escape_string($row['Email'])."'";
        $username = "'".$conn->real_escape_string($row['Username'])."'";
        $display_name = "'".$conn->real_escape_string(
                ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
            )."'";

        // VALUES should be added only if the row doesn't exist
        if ($rows_returned === 0) {

            // VALUES should be appended until they reach 1000
            while ($i % 1000 !== 0) {
                $sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW())";
                ++$i;;
            }

            // QUERY should be output to console to see if it's right or something is wrong
            echo "DEBUG: ", $sqlInsert, "\n";

            // QUERY should be executed if there are 1000 VALUES ready to add as a batch

            /*$rs = $conn->query($sqlInsert);

            if ($rs === false) {
                echo 'Wrong SQL: '.$sqlInsert.' Error: '.$conn->error, E_USER_ERROR;*/
            }
        } else {
            // UPDATE
            echo "UPDATE";
        }
    }
}

But this line of code: echo "DEBUG: ", $sql, "\n"; is not outputting anything to the console. I must be doing something wrong but I can't find what. Can anyone help me build the proper batch query and execute it each time 1000 values have been appended?

Proper output should be:

DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008ReolAAC'
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='005800000039SIWAA2'
....
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES(...), VALUES(...), VALUES(...)

Obtained result:

DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RGg6AAG'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RQ4CAAW'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
.... // until reach 1454 results

The table is empty, so the code should never go through the ELSE branch (the UPDATE one).

EDIT

With help from the answer this is how the code looks now:

$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";

$i = 1;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";

foreach ($data as $row) {
    $sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
    $rs = $conn->query($sql);

    if ($rs === false) {
        echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
    } else {
        $rows_returned = $rs->num_rows;

        $veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
        $first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
        $last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
        $email = "'".$conn->real_escape_string($row['Email'])."'";
        $username = "'".$conn->real_escape_string($row['Username'])."'";
        $display_name = "'".$conn->real_escape_string(
                ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
            )."'";

        if ($rows_returned === 0) {
            if ($i % 1000 === 0) {
                file_put_contents("output.log", $sqlInsert."\n", FILE_APPEND);
                $sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
            } else {
                $sqlInsert .= "($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW()), ";
            }

            $i++;
        } else {
            echo "UPDATE";
        }
    }
}

But it is still buggy because:

  • I get a first, empty INSERT query: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES
  • I get a second INSERT query with 1000 VALUES() appended, but what happened to the rest, the remaining 454?

Can anyone give me another tip? Help?


Answer 1:


Consider using INSERT IGNORE INTO table to handle the check for whether the record already exists (see "How to 'insert if not exists' in MySQL?"). If you haven't already done so, make veeva_rep_id a PRIMARY key (a UNIQUE index works as well) so that INSERT IGNORE can detect the duplicates.

Also check out PDO for transactions, prepared statements and dynamically generated queries (see "PDO Prepared Inserts multiple rows in single query").

<?php

$sql = 'INSERT IGNORE INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES ';

$insertQuery = array();
$insertData = array();
$n = 0; // suffix that keeps each row's placeholder names unique

/*

assuming the array from the csv is like this

$data = array(
    0 => array('name' => 'Robert', 'value' => 'some value'),
    1 => array('name' => 'Louise', 'value' => 'another value')
);
*/

foreach ($data as $row) {
    $insertQuery[] = '(:veeva_rep_id' . $n . ', :first' . $n . ', :last' . $n . ', :email' . $n . ', :username' . $n . ', :lastLoginAt' . $n . ', :lastSyncAt' . $n . ', :display_name' . $n . ', :rep_type' . $n . ', :avatar_url' . $n . ', :createdAt' . $n . ', :updatedAt' . $n . ')';
    $insertData['veeva_rep_id' . $n] = $row['name'];
    $insertData['first' . $n] = $row['value'];
    $insertData['last' . $n] = $row['name'];
    $insertData['email' . $n] = $row['value'];
    $insertData['username' . $n] = $row['name'];
    $insertData['lastLoginAt' . $n] = $row['value'];
    $insertData['lastSyncAt' . $n] = $row['value'];
    $insertData['display_name' . $n] = $row['name'];
    $insertData['rep_type' . $n] = $row['value'];
    $insertData['avatar_url' . $n] = $row['value'];
    $insertData['createdAt' . $n] = $row['name'];
    $insertData['updatedAt' . $n] = $row['value'];

    $n++;
}

$db->beginTransaction();

if (!empty($insertQuery)) { // run the insert whenever at least one row was collected
    $sql .= implode(', ', $insertQuery);

    $stmt = $db->prepare($sql);
    $stmt->execute($insertData);
}

$db->commit();

print $sql . PHP_EOL;

let me know if it helps.
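
For larger files, such as the roughly 150k rows mentioned in the question, a single statement holding every row may hit server limits (MySQL prepared statements accept at most 65,535 placeholders), so the rows can be split into chunks first. The following is only a sketch of that idea, not part of the original answer: buildInsertIgnore is a hypothetical helper, positional ? placeholders replace the named ones used above, and the column list in the usage comment is shortened for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): builds one
// INSERT IGNORE statement with positional placeholders for a chunk of rows.
function buildInsertIgnore(string $table, array $columns, array $rows): array
{
    // One "(?, ?, ...)" group per row
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT IGNORE INTO ' . $table . ' (' . implode(', ', $columns) . ') VALUES '
         . implode(', ', array_fill(0, count($rows), $group));

    // Flatten the row values in column order so they line up with the placeholders
    $params = array();
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }

    return array($sql, $params);
}

// Usage sketch; $db is assumed to be a connected PDO instance and $data
// an array of rows keyed by column name:
//
// foreach (array_chunk($data, 1000) as $chunk) {
//     list($sql, $params) = buildInsertIgnore('reps', array('veeva_rep_id', 'first', 'last'), $chunk);
//     $db->prepare($sql)->execute($params);
// }
```

Because array_chunk also returns the final, shorter chunk, the last 454 of 1454 rows are inserted without any special handling.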




Answer 2:


You should have something like:

// Try fetching data from table 1

// If there is no record available, then fetch some data from table 2
// and insert that data into table 1

You just wrote

$sql = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";

// Processing on each row of data
foreach ($data as $row) {

But an INSERT does not select any data and, second, you didn't run a SELECT; where does $data come from?

Update: use if ($i % 1000 === 0) { instead of while ($i % 1000 !== 0) {

$i         = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,...) ";

// Processing on each row of data
foreach ($data as $row) {
    $sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
    echo "DEBUG: ", $sql, "\n";
    $rs = $conn->query($sql);

    if ($rs === false) {
        echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
    } else {

        $veeva_rep_id = ...;
        $first = ...;
        $last = ...;
        $email = ...;
        // ...

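        // VALUES should be added only if the row doesn't exist
        if ($rs->num_rows == 0) {
            // Collect this row's tuple first so that no row is skipped
            // ($values is created by the first append)
            $values[] = "($veeva_rep_id,$first,$last,$email,...)";
            $i++;

            // Flush once 1000 tuples have been collected
            if ($i % 1000 === 0) {
                echo "DEBUG: ", $sqlInsert, "VALUES ", implode(", ", $values), "\n";
                // execSql($sqlInsert . "VALUES " . implode(", ", $values));
                $values = array(); // reset
            }
        } else {
            echo "UPDATE";
        }
    }
}

// Flush the remaining tuples (fewer than 1000) after the loop
if (!empty($values)) {
    echo "DEBUG: ", $sqlInsert, "VALUES ", implode(", ", $values), "\n";
    // execSql($sqlInsert . "VALUES " . implode(", ", $values));
}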
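
One way to avoid both problems reported in the question's edit (the empty first query and the missing last 454 rows) is to append each tuple before checking the batch size, and to flush whatever is left once the loop ends. A minimal sketch of that pattern follows; buildBatchedInserts is a hypothetical name, the tuples are assumed to be escaped already, and the single-column table prefix is only there to keep the example short.

```php
<?php
// Hypothetical helper: groups already-escaped "(...)" value tuples into
// INSERT statements of at most $batchSize rows each.
function buildBatchedInserts(string $prefix, array $tuples, int $batchSize = 1000): array
{
    $statements = array();
    $pending = array();

    foreach ($tuples as $tuple) {
        // Append first, then check the size, so the row that fills the batch is kept
        $pending[] = $tuple;

        if (count($pending) === $batchSize) {
            $statements[] = $prefix . ' VALUES ' . implode(', ', $pending);
            $pending = array();
        }
    }

    // Flush whatever is left over (fewer than $batchSize rows)
    if (!empty($pending)) {
        $statements[] = $prefix . ' VALUES ' . implode(', ', $pending);
    }

    return $statements;
}

// Example with 1454 dummy tuples, the row count from the question
$tuples = array();
for ($n = 1; $n <= 1454; $n++) {
    $tuples[] = "('id$n')";
}
$statements = buildBatchedInserts('INSERT INTO reps(veeva_rep_id)', $tuples);
echo count($statements), "\n"; // prints 2: one batch of 1000 rows and one of 454
```

Each returned statement uses a single VALUES keyword followed by comma-separated tuples, which is the multi-row form MySQL expects.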



Answer 3:


Since it looks like you are trying to load data from a CSV file, you might want to consider the LOAD DATA INFILE functionality, which is designed specifically for this purpose.

Here is a link to the documentation: https://dev.mysql.com/doc/refman/5.6/en/load-data.html
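
As a rough illustration of what such a statement might look like for the reps table, here is a sketch that only builds the SQL string. Several things in it are assumptions rather than facts from the question: the CSV is taken to have a header row and the column order Id, FirstName, LastName, Email, Username; buildLoadDataSql is a hypothetical helper; the LOCAL variant is used, which requires local_infile to be enabled on both the server and the client; and the ucfirst/strtolower normalisation of the names is left out.

```php
<?php
// Hypothetical helper: returns a LOAD DATA statement for the reps table.
// $escapedPath is assumed to be already escaped for use inside a quoted
// SQL string, e.g. with $conn->real_escape_string().
function buildLoadDataSql(string $escapedPath): string
{
    // Nowdoc, so '\n' reaches MySQL as the two characters backslash and n
    $template = <<<'SQL'
LOAD DATA LOCAL INFILE '%s'
IGNORE INTO TABLE reps
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(veeva_rep_id, @first, @last, email, username)
SET first = @first,
    last = @last,
    display_name = CONCAT(@first, ' ', @last),
    lastLoginAt = NOW(), lastSyncAt = NOW(),
    rep_type = 'VEEVA',
    avatar_url = 'https://pdone.s3.amazonaws.com/avatar/default_avatar.png',
    createdAt = NOW(), updatedAt = NOW()
SQL;

    return sprintf($template, $escapedPath);
}

// The path below is a placeholder
echo buildLoadDataSql('/tmp/reps.csv'), "\n";
```

The IGNORE keyword before INTO TABLE makes the server skip rows that would duplicate an existing unique key, which plays the same role as INSERT IGNORE in the first answer.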



Source: https://stackoverflow.com/questions/31052225/build-a-batch-query-for-mysql-insert-each-1000-items
