Tutorial about Using multi-threading in jdbc

前端 未结 7 816
别跟我提以往
别跟我提以往 2020-12-24 00:54

Our company has a Batch Application which runs every day, It does some database related jobs mostly, import data into database table from file for example.

There are

相关标签:
7条回答
  • 2020-12-24 00:59

    Multi-threading will improve your performance but there are a couple of things you need to know:

    1. Each thread needs its own JDBC connection. Connections can't be shared between threads because each connection is also a transaction.
    2. Upload the data in chunks and commit once in a while to avoid accumulating huge rollback/undo tables.
    3. Cut tasks into several work units where each unit does one job.

    To elaborate the last point: Currently, you have a task that reads a file, parses it, opens a JDBC connection, does some calculations, sends the data to the database, etc.

    What you should do:

    1. One (!) thread to read the file and create "jobs" out of it. Each job should contains a small, but not too small "unit of work". Push those into a queue
    2. The next thread(s) wait(s) for jobs in the queue and do the calculations. This can happen while the threads in step #1 wait for the slow hard disk to return the new lines of data. The result of this conversion step goes into the next queue
    3. One or more threads to upload the data via JDBC.

    The first and the last threads are pretty slow because they are I/O bound (hard disks are slow and network connections are even worse). Plus inserting data in a database is a very complex task (allocating space, updating indexes, checking foreign keys)

    Using different worker threads gives you lots of advantages:

    1. It's easy to test each thread separately. Since they don't share data, you need no synchronization. The queues will do that for you
    2. You can quickly change the number of threads for each step to tweak performance
    0 讨论(0)
  • 2020-12-24 01:03

    I am not all that familiar with JDBC but regarding the multithreading bit of your question, what you should keep in mind is that parallel processing relies on effectively dividing your problem into bits that are independent of one another and in some way putting them back together (their output that is). If you dont know the underlying dependencies between tasks you might end up having really odd errors/exceptions in your code. Even worse, it might all execute without any problems, but the results might be off from true values. Multi-threading is tricky business, in a way fun to learn (at least I think so) but pain in the neck when things go south.

    Here are a couple of links that might provide useful:

    • Oracle's java trail: best place to start
    • A good tutorial for java concurrency
    • an interesting article on concurrency

    If you are serious about putting effort to getting into multi-threading I can recommend GOETZ, BRIAN: JAVA CONCURRENCY, amazing book really..

    Good luck

    0 讨论(0)
  • 2020-12-24 01:06

    The fastest way I've found to insert large numbers of records into Oracle is with array operations. See the "setExecuteBatch" method, which is specific to OraclePreparedStatement. It's described in one of the examples here: http://betteratoracle.com/posts/25-array-batch-inserts-with-jdbc

    0 讨论(0)
  • 2020-12-24 01:09

    Multi threading may be of help, if the lines are uncorrelated, you may start off two processes one reading even lines, another uneven lines, and get your db connection from a connection pool (dbcp) and analyze performance. But first I would investigate whether jdbc is the best approach normally databases have optimized solutions for imports like this. These solutions may also temporarily switch of constraint checking of your table, and turn that back on later, which is also great for performance. As always depending on your requirements.

    Also you may want to checkout springbatch which is designed for batch processing.

    0 讨论(0)
  • 2020-12-24 01:17

    I had a similar task. But in my case, all the tables were unrelated to each other.

    STEP1: Using SQL Loader(Oracle) for uploading data into database(very fast) OR any similar bulk update tools for your database.

    STEP2: Running each uploading process in a different thread(for unrelated tasks) and in a single thread for related tasks.

    P.S. You could identify different inter-related jobs in your application and categorize them in groups; and running each group in different threads.

    Links to run you up:

    JAVA Threading follow the last example in the above link(Example: Partitioning a large task with multiple threads)

    SQL Loader can dramatically improve performance

    0 讨论(0)
  • 2020-12-24 01:22

    As far as I know,the JDBC Bridge uses synchronized methods to serialize all calls to ODBC so using mutliple threads won't give you any performance boost unless it boosts your application itself.

    0 讨论(0)
提交回复
热议问题