Using “IN” in a WHERE clause where the number of items in the set is very large

梦毁少年i 2021-02-13 05:30

I have a situation where I need to do an update on a very large set of rows that I can only identify by their ID (since the target records are selected by the user and have noth

10 answers
  •  花落未央
    2021-02-13 06:00

    In general there are several things to consider.

    1. The statement parsing cache in the DB. Each statement with a different number of items in the IN clause has to be parsed separately. You ARE using bound variables instead of literals, right?
    2. Some databases limit the number of items in an IN clause. For Oracle the limit is 1000.
    3. When updating, you lock records. If you issue multiple separate update statements, you can get deadlocks. This means you have to be careful about the order in which you issue your updates.
    4. Round-trip latency to the database can be high, even for a very fast statement. This means it's often better to manipulate lots of records at once to save round trips.

    We recently changed our system to limit the size of the IN clauses and always use bound variables, because this reduced the number of distinct SQL statements and thus improved performance. Basically, we generate our SQL statements and execute multiple statements if the IN clause exceeds a certain size. We don't do this for updates, so we haven't had to worry about the locking. You will.
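
A minimal sketch of the chunking approach described above, written in Python against SQLite purely for illustration. The `items` table, its `status` column, the `CHUNK_SIZE` value, and both function names are assumptions, not part of the original system. IDs are sorted before batching so that every caller touches rows in the same order, which is one common way to reduce the deadlock risk mentioned in point 3.

```python
import sqlite3
from typing import Iterable, Iterator, List

# Assumed batch size; keep it below the database's IN-list limit (1000 on Oracle).
CHUNK_SIZE = 500


def chunks(ids: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def update_in_chunks(conn: sqlite3.Connection, ids: Iterable[int], status: str) -> int:
    """Update the selected rows in fixed-size batches; return the rows changed."""
    # Deduplicate and sort so every caller updates rows in the same order.
    ordered = sorted(set(ids))
    updated = 0
    for batch in chunks(ordered, CHUNK_SIZE):
        # One bound-variable placeholder per ID; no literals in the SQL text.
        placeholders = ", ".join("?" for _ in batch)
        sql = f"UPDATE items SET status = ? WHERE id IN ({placeholders})"
        cur = conn.execute(sql, [status, *batch])
        updated += cur.rowcount
    conn.commit()
    return updated
```

Every full batch produces the same statement text, so the database can reuse the parsed statement. The final batch is usually shorter and yields a different text; padding it with a repeated ID is one way to keep a single statement shape, if that turns out to matter.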

    Using a temp table may not improve performance because you have to populate the temp table with the IDs. Experimentation and performance tests can tell you the answer here.
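
For comparison, the temp-table variant could look roughly like the sketch below, again in Python against SQLite. The `selected_ids` and `items` table names and the function name are hypothetical, and `INSERT OR IGNORE` and `CREATE TEMP TABLE` are SQLite syntax that other databases spell differently. The `executemany` call is the population cost referred to above, so this is something to measure rather than assume is faster.

```python
import sqlite3
from typing import Iterable


def update_via_temp_table(conn: sqlite3.Connection, ids: Iterable[int], status: str) -> int:
    """Stage the IDs in a temporary table, then update with a single statement."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS selected_ids (id INTEGER PRIMARY KEY)")
    # Clear anything left over from an earlier call on this connection.
    conn.execute("DELETE FROM selected_ids")
    # Populating the table is the extra work this approach pays for.
    conn.executemany(
        "INSERT OR IGNORE INTO selected_ids (id) VALUES (?)",
        ((i,) for i in ids),
    )
    # The statement text is identical no matter how many IDs were staged.
    cur = conn.execute(
        "UPDATE items SET status = ? WHERE id IN (SELECT id FROM selected_ids)",
        (status,),
    )
    conn.commit()
    return cur.rowcount
```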

    A single IN clause is very easy to understand and maintain. This is probably what you should worry about first. If you find that the performance of the queries is poor you might want to try a different strategy and see if it helps, but don't optimize prematurely. The IN-clause is semantically correct so leave it alone if it isn't broken.
