How to improve the speed of thousands of inserts in a Nested Set model?

Submitted by 拈花ヽ惹草 on 2019-12-24 05:37:11

Question


I'm using MySQL. I want to insert 100,000 places that belong to a Nested Set model with different levels (root_id, lft, rgt, level). If I didn't have to worry about the tree, I would do a bulk insert; but in this case, I have to run 3 queries for each object, in this order:

foreach ( $places as $place )
{
    # get parent place from $ancestor_array ( woeid => ids )
    $parent = $ancestors[:parent_woeid]

    update place set lft = lft + 2 where root_id = :root_id and lft >= :obj_lft;
    update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
    insert into place (..., root_id, lft, rgt, level, ...) values (..., :obj_root_id, :obj_lft, :obj_rgt, :obj_level, ...);

    ...
}
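The three statements above can be sketched end-to-end as follows (a minimal illustration using SQLite in place of MySQL; the `insert_place` helper and the reduced column list are assumptions for the demo, not code from the question):

```python
import sqlite3

# Per-row nested-set insert: shift lft/rgt to open a gap, then insert the leaf.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE place ("
    "id INTEGER PRIMARY KEY, root_id INTEGER, lft INTEGER, rgt INTEGER, level INTEGER)"
)

def insert_place(conn, root_id, parent_lft, level):
    """Insert a new leaf as the first child of the node whose lft is parent_lft."""
    new_lft = parent_lft + 1  # the new node opens right after its parent
    new_rgt = parent_lft + 2
    # Shift everything at or beyond the insertion point right by 2.
    conn.execute("UPDATE place SET lft = lft + 2 WHERE root_id = ? AND lft >= ?",
                 (root_id, new_lft))
    conn.execute("UPDATE place SET rgt = rgt + 2 WHERE root_id = ? AND rgt >= ?",
                 (root_id, new_lft))
    cur = conn.execute(
        "INSERT INTO place (root_id, lft, rgt, level) VALUES (?, ?, ?, ?)",
        (root_id, new_lft, new_rgt, level))
    return cur.lastrowid

# Root node covers the whole tree: (lft, rgt) = (1, 2) initially.
conn.execute("INSERT INTO place (root_id, lft, rgt, level) VALUES (1, 1, 2, 0)")
insert_place(conn, root_id=1, parent_lft=1, level=1)  # root grows to (1, 4)
insert_place(conn, root_id=1, parent_lft=1, level=1)  # sibling shifts; root grows to (1, 6)
```

Note that every insert rewrites a large share of the rows in the same root, which is exactly why doing this 100,000 times is slow.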

That takes a long time... so it's time to try to be smarter. This is what I'm thinking:

foreach ( $places as $place )
{
    # get parent place from $ancestor_array ( woeid => ids )
    $parent = $ancestors[:parent_woeid]

    $new_admins[] = array('parent_woeid' => :parent_woeid, ...data place....)
    $woeids[] = :parent_woeid;
}

# lock + bulk insert of $new_admins with lft=rgt=level=root_id= null + unlock
    insert into place (...) values (...), (...), (...), ...., (...)

# get ids of those places
select id, woeid from place where woeid in (:array_woeids)

# lock + bulk update with 3 updates per place + unlock
    update place set lft= lft + 2 where root_id = :root_id and lft >= :obj_lft;
    update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
    update place set lft=:obj_lft, rgt=:obj_rgt where id=:id

# i have to update the ancestors lft & rgt values on the $ancestors_array, 
# because we have them "cached" in the array in order to avoid doing a
# select per inserted place.

What do you think? How would you do it? Would you write all the inserts and updates to a file and use LOAD DATA INFILE instead of this approach?

Is there any other option I'm missing?

thanks a lot!


Answer 1:


Wouldn't it be more efficient to just insert the items and recalculate all the lft/rgt indexes only when all items are inserted?

However, to do that you have to know the parent ID of each item, so you have something to base the indexes on. I personally had success keeping lft + rgt + parentId with each data item; it makes maintenance much easier (you can do a trivial recalculation/integrity check at any time).
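The recalculation the answer describes is a single depth-first walk over the parent links: visit each node, assign lft on the way down and rgt on the way back up. A minimal sketch (the function name and the id-to-parent mapping shape are illustrative, not from the answer):

```python
from collections import defaultdict

def rebuild_nested_set(parent_of):
    """parent_of maps node id -> parent id (None for the root).
    Returns {id: (lft, rgt, level)} for the whole tree."""
    children = defaultdict(list)
    root = None
    for node, parent in parent_of.items():
        if parent is None:
            root = node
        else:
            children[parent].append(node)

    result = {}
    counter = 0
    # Iterative DFS with explicit "closing" markers, so a deep tree
    # cannot hit Python's recursion limit.
    stack = [(root, 0, False)]
    while stack:
        node, level, closing = stack.pop()
        counter += 1
        if closing:
            result[node] = (result[node][0], counter, level)  # assign rgt
        else:
            result[node] = (counter, None, level)             # assign lft
            stack.append((node, level, True))
            for child in reversed(children[node]):
                stack.append((child, level + 1, False))
    return result
```

With all 100,000 rows bulk-inserted first, one pass of this function followed by a batched UPDATE per root replaces the 300,000 shift queries of the original approach.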



Source: https://stackoverflow.com/questions/5352698/how-to-improve-speed-of-thousand-of-inserts-in-a-nested-set-model
