I need to process about 250,000 documents per day with an EJB 3.1 asynchronous method, in order to cope with a task that takes a long time overall.
I do this to use more threads and process more documents concurrently. Here's an example in pseudo code:
// this returns about 250,000 documents per day
List<Document> documentList = Persistence.listDocumentsToProcess();
for (Document currentDocument : documentList) {
    // this is the asynchronous call
    ejbInstance.processAsynchronously(currentDocument);
}
Suppose I have a thread pool of size 10 and a 4-core processor. My questions are:
- How many documents will the application server process SIMULTANEOUSLY?
- What happens when all threads in the pool are processing documents and one more asynchronous call comes in? Will this work like a sort of JMS queue?
- Would I see any improvement by adopting a JMS queue solution?
I work with Java EE 6 and WebSphere 8.5.5.2.
The default configuration for asynchronous EJB method calls is as follows (from the infocenter):
The EJB container work manager has the following thread pool settings:
- Minimum number of threads = 1
- Maximum number of threads = 5
- Work request queue size = 0 work objects
- Work request queue full action = Block
- Remote Future object duration = 86400 seconds
So trying to answer your questions:
how many documents will the application server process SIMULTANEOUSLY? (assuming 10 size thread pool)
This thread pool is shared by all EJB async calls, so first you need to assume that your application is the only one using EJB async calls. In that case you will potentially have 10 runnable instances in flight at the same time. Whether they actually execute in parallel depends on the number of cores/threads available in the system, so you can't get an exact number (some cores/threads may be doing web work, for example, or other processes may be using the CPU).
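The difference between "in flight" and "truly parallel" can be illustrated with a plain `java.util.concurrent` pool. This is only an analogy, not the WebSphere work manager itself; the class and method names (`PoolConcurrencyDemo`, `peakConcurrency`) are made up for the sketch, and the `Thread.sleep` stands in for document processing.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: a fixed pool as a stand-in for the EJB async thread pool.
class PoolConcurrencyDemo {

    /** Runs taskCount short tasks on a fixed pool and returns the peak number in flight at once. */
    static int peakConcurrency(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest value seen
                try {
                    Thread.sleep(20); // stand-in for processing one document
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // The pool size caps the tasks in flight; the core count caps true parallelism.
        System.out.println("available cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("peak in-flight tasks: " + peakConcurrency(10, 50));
    }
}
```

The peak never exceeds the pool size, whatever the number of cores. With CPU-bound work on 4 cores, those 10 in-flight tasks would share the cores rather than each getting its own.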
what happens when all threads in the pool are processing documents and one more asynchronous call comes in?
It depends on the Work request queue size and Work request queue full action settings. If there are no available threads in the pool, requests are queued until the queue size is reached. What happens after that depends on the action, which can be Block or Fail.
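A rough way to picture the two actions is a bounded `ThreadPoolExecutor` in plain Java SE. This is an analogy under stated assumptions, not WebSphere's implementation: `QueueFullActionDemo`, `FullAction`, `newPool`, and `countRejected` are hypothetical names, "Fail" is modelled with the standard `AbortPolicy`, and "Block" is modelled with a handler that makes the submitting thread wait for queue space.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: models "queue full action" with a bounded executor.
class QueueFullActionDemo {
    enum FullAction { BLOCK, FAIL }

    /** Builds a pool with a fixed thread count and a bounded queue (queueSize must be >= 1 here). */
    static ThreadPoolExecutor newPool(int threads, int queueSize, FullAction action) {
        RejectedExecutionHandler handler;
        if (action == FullAction.FAIL) {
            // Fail: the submitter gets a RejectedExecutionException.
            handler = new ThreadPoolExecutor.AbortPolicy();
        } else {
            // Block: the submitter waits until the queue has room again.
            handler = (task, executor) -> {
                try {
                    executor.getQueue().put(task);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RejectedExecutionException(e);
                }
            };
        }
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize), handler);
    }

    /** Submits tasks that each sleep taskMillis and returns how many submissions were rejected. */
    static int countRejected(FullAction action, int threads, int queueSize, int tasks, long taskMillis) {
        ThreadPoolExecutor pool = newPool(threads, queueSize, action);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        Thread.sleep(taskMillis); // stand-in for processing one document
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("FAIL rejected:  " + countRejected(FullAction.FAIL, 2, 2, 10, 200));
        System.out.println("BLOCK rejected: " + countRejected(FullAction.BLOCK, 2, 2, 10, 200));
    }
}
```

With Block, the loop that submits the documents simply slows down to the pace of the workers and nothing is lost. With Fail, the caller has to handle the rejection itself, for example by retrying later.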
would I have any improvement adopting a JMS Queue solution
It depends on your needs. Here are some pros and cons of a JMS solution.
Pros:
- Persistence - with JMS your asynchronous tasks can be persistent, so in case of a server failure you will not lose them; they will be processed after a restart or by another cluster member. The EJB async queue is held only in memory, so queued tasks are lost in case of failure.
- Scalability - if you put tasks on the queue, they can be processed concurrently by many servers in the cluster, not limited to a single JVM.
- Expiration and priorities - you can define different expiration time or priorities for your messages.
Cons:
- More complex application - you will need to implement an MDB to process your tasks.
- More complex infrastructure - you will need a database to store the queues (the file system can be used for a single server, and a shared file system can be used for clusters), or an external messaging solution like WebSphere MQ.
- Performance - slightly lower performance for processing a single item and higher load on the server, as each message has to be serialized/deserialized to and from persistent storage.
Source: https://stackoverflow.com/questions/27026458/ejb-3-1-asynchronous-method-and-thread-pool