Using FlatFileItemReader with a TaskExecutor (Thread Safety)

Submitted by 房东的猫 on 2019-12-05 22:20:31

Question


There are a lot of examples which use FlatFileItemReader along with TaskExecutor. I provide samples below (both with XML and Java Config):

  • Using Oracle Coherence with Spring Batch
  • Spring Batch Multithreading Example

I have used it myself with XML configuration for large CSVs (GB size), writing to a database with the out-of-the-box JpaItemWriter. There seem to be no issues, even without setting save-state = false or adding any kind of special handling.

Now, FlatFileItemReader is documented as not thread-safe.

My guess was that JpaItemWriter was "covering" the issue by persisting Sets, i.e. collections with no duplicates, as long as hashCode() and equals() cover the business key of the entity. However, even that would not be enough to prevent duplicates caused by non-thread-safe reading and processing.

Could you please clarify: is it proper/correct/safe to use the out-of-the-box FlatFileItemReader within a Tasklet that has a TaskExecutor assigned, regardless of the Writer? If not, how could we explain in theory the lack of errors when a JpaItemWriter is used?

P.S.: The example links that I give above use FlatFileItemReader with a TaskExecutor without mentioning possible thread-safety issues at all...


Answer 1:


TL;DR: It is safe to use a FlatFileItemReader with a TaskExecutor provided the Writer is thread-safe (assuming that you are not concerned with restarting jobs, retrying steps, skipping, etc. at the moment).

Update: There is now a JIRA that officially confirms that saveState needs to be set to false (i.e. restartability must be disabled) if one wants to use FlatFileItemReader with a TaskExecutor in a thread-safe manner.
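
For reference, a minimal Java-config sketch of such a step might look like the following. It assumes Spring Batch 4.x (where StepBuilderFactory is provided by @EnableBatchProcessing); the Person type, the people.csv path, the bean names, and the chunk size are all hypothetical, and the thread-safe writer (for example a JpaItemWriter) is assumed to be defined elsewhere in the context.

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

@Configuration
@EnableBatchProcessing
public class MultiThreadedStepConfig {

    // Hypothetical item type; targetType() below maps columns onto its setters.
    public static class Person {
        private Long id;
        private String name;

        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    @Bean
    public FlatFileItemReader<Person> personReader() {
        return new FlatFileItemReaderBuilder<Person>()
                .name("personReader")
                .resource(new FileSystemResource("people.csv")) // hypothetical path
                .delimited()
                .names(new String[] {"id", "name"})
                .targetType(Person.class)
                .saveState(false) // do not store restart data in the ExecutionContext
                .build();
    }

    @Bean
    public Step multiThreadedStep(StepBuilderFactory steps,
                                  FlatFileItemReader<Person> personReader,
                                  ItemWriter<Person> personWriter) { // assumed thread-safe, defined elsewhere
        return steps.get("multiThreadedStep")
                .<Person, Person>chunk(100)
                .reader(personReader)
                .writer(personWriter)
                .taskExecutor(new SimpleAsyncTaskExecutor("batch-")) // chunks run on separate threads
                .build();
    }
}
```

With saveState set to false the step cannot resume from the last processed line after a failure, which is the trade-off mentioned above.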


Let's first hear it from the horse's mouth by seeing what the Spring documentation says about using multi-threaded steps with a TaskExecutor:

"Spring Batch provides some implementations of ItemWriter and ItemReader. Usually they say in the Javadocs if they are thread safe or not, or what you have to do to avoid problems in a concurrent environment. If there is no information in Javadocs, you can check the implementation to see if there is any state."

Let's address your questions now:

"Could you please clarify: is it proper/correct/safe to use the out-of-the-box FlatFileItemReader within a Tasklet that has a TaskExecutor assigned, regardless of the Writer? If not, how could we explain in theory the lack of errors when a JpaItemWriter is used?"

The statement "regardless of the Writer" is incorrect. The Writer you use must be thread-safe. The JpaItemWriter is thread-safe according to the Javadocs and can safely be used with a FlatFileItemReader that is not thread-safe. Explaining how JpaItemWriter is thread-safe would make this answer long. I recommend that you post another question if you are interested in how specific writers handle thread safety (as the Spring Batch docs mention as well).

"P.S.: The example links that I give above use FlatFileItemReader with a TaskExecutor without mentioning possible thread-safety issues at all..."

If you take a look at the Coherence example, you will see that they clearly modify CoherenceBatchWriter.java in Figure 6. They first make mapBatch a local variable so that each thread has its own copy of this Map. Moreover, if you dig further into the Coherence API, you should find that the NamedCache being returned is thread-safe.
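
The idea behind that change can be sketched in plain Java. This is not the code from the linked article: LocalBatchWriter and writeConcurrently are hypothetical names, and a ConcurrentHashMap stands in for the thread-safe NamedCache. The point is only that the writer keeps no mutable per-batch state in a field, so concurrent calls cannot interfere with each other.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LocalBatchWriterSketch {

    /** Hypothetical writer: its only field is a reference to a thread-safe cache. */
    static class LocalBatchWriter {
        private final Map<Integer, String> cache; // stand-in for a thread-safe NamedCache

        LocalBatchWriter(Map<Integer, String> cache) {
            this.cache = cache;
        }

        void write(List<Integer> chunk) {
            // Local variable: every call, and therefore every thread, gets its own map.
            Map<Integer, String> mapBatch = new HashMap<>();
            for (Integer id : chunk) {
                mapBatch.put(id, "item-" + id);
            }
            cache.putAll(mapBatch); // one bulk write per chunk into the shared cache
        }
    }

    /** Writes one chunk of distinct keys per thread and returns the final cache size. */
    static int writeConcurrently(int threads, int chunkSize) {
        Map<Integer, String> cache = new ConcurrentHashMap<>();
        LocalBatchWriter writer = new LocalBatchWriter(cache);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            List<Integer> chunk = new ArrayList<>();
            for (int i = 0; i < chunkSize; i++) {
                chunk.add(t * chunkSize + i);
            }
            pool.submit(() -> writer.write(chunk));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(writeConcurrently(4, 250));
    }
}
```

If mapBatch were an instance field shared by all threads instead, one thread could clear or overwrite entries that another thread had not yet flushed.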

The second link that you provide looks really dicey since the Writer does not do anything to avoid race conditions. That example is indeed an incorrect use of a multi-threaded step.




Source: https://stackoverflow.com/questions/42270806/using-flatfileitemreader-with-a-taskexecutor-thread-safety
