pyodbc + SQLAlchemy fails for more than 2100 items

盖世英雄少女心 2021-01-14 10:39

The code below throws an error when employee_code_list contains more than 2000 items, as mentioned below. However, it works perfectly when the list has fewer than 2000 items.

1 Answer
  • 2021-01-14 11:06

    Your query forces SQLAlchemy to emit a statement with 2000+ parameters (SELECT * WHERE Y IN (list of 2000+ values)). Different RDBMSs (and different drivers) limit the number of parameters a single statement may have.
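
    To make the parameter count concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of SQL Server, purely so that it runs on its own. The employee table, its code column, and the build_in_query helper are hypothetical names, not part of the question. Each value in the IN list becomes its own bound parameter, so a list of N items costs N parameters on top of any others the query already uses.

```python
import sqlite3


def build_in_query(values):
    """Return (sql, params) for an IN filter with one placeholder per value."""
    placeholders = ", ".join("?" for _ in values)
    sql = f"SELECT code FROM employee WHERE code IN ({placeholders})"
    return sql, list(values)


# Hypothetical in-memory table, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (code TEXT)")
conn.executemany(
    "INSERT INTO employee (code) VALUES (?)",
    [("a1",), ("b2",), ("c3",)],
)

sql, params = build_in_query(["a1", "c3", "zz"])
rows = conn.execute(sql, params).fetchall()

# One "?" placeholder was emitted for each item in the list.
param_count = sql.count("?")
```

    With a 2000-item list the same construction would produce 2000 placeholders, which is the quantity that runs into a server-side parameter limit.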

    Although your stack trace doesn't show the exact error, I notice that you're using SQL Server, and the numbers you mention are suspiciously close to the 2100-parameter limit that SQL Server imposes under certain circumstances (see "Parameters per user-defined function" in this Microsoft knowledge article). I would be willing to bet that this is what you're running into.

    The easiest approach you can take is to simply run your query in batches for each, say, 1000 items in employee_code_list:

    from sqlalchemy import and_, func

    results = []
    batch_size = 1000  # stays well below SQL Server's 2100-parameter limit
    batch_start = 0

    while batch_start < len(employee_code_list):
        # Slice out the next batch of at most batch_size employee codes
        batch_end = batch_start + batch_size
        employee_code_batch = employee_code_list[batch_start:batch_end]
        query = session.query(TblUserEmployee, TblUser).filter(
            and_(
                TblUser.UserId == TblUserEmployee.EmployeeId,
                func.lower(TblUserEmployee.EmployeeCode).in_(employee_code_batch),
                TblUser.OrgnId == MIG_CONSTANTS.context.organizationid,
                TblUser.UserTypeId == user_type,
            )
        )
        # extend (rather than append) keeps results a flat list of rows
        results.extend(query.all())
        batch_start += batch_size
    

    In this example we're creating an empty results list that we will append each batch of results to. We're setting a batch size of 1000 and a start position of 0 (the first item in employee_code_list). We're then running your query for each batch of 1000, and appending the results to results, until there are no records left to query in employee_code_list.
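
    If you need the same pattern in more than one place, the batching can be pulled out of the query code. The following is only a sketch under assumptions: chunked, query_in_batches, and fake_query are hypothetical names, and fake_query stands in for the real database call so that the example runs without a database.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of items, each at most size long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def query_in_batches(
    items: Sequence[T],
    size: int,
    run_query: Callable[[Sequence[T]], List[R]],
) -> List[R]:
    """Call run_query once per batch and collect all rows in one flat list."""
    results: List[R] = []
    for batch in chunked(items, size):
        results.extend(run_query(batch))
    return results


def fake_query(batch: Sequence[int]) -> List[int]:
    """Stand-in for the real query: returns only the even codes in the batch."""
    return [code for code in batch if code % 2 == 0]


codes = list(range(2500))
batch_sizes = [len(batch) for batch in chunked(codes, 1000)]
matches = query_in_batches(codes, 1000, fake_query)
```

    In real use, run_query would be a small function that builds the filtered session.query for one batch and returns query.all().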

    There are other approaches of course, but this is one that won't require you to use a different RDBMS, and might be easiest to work into your code.
