Question
My Java application works on music files within folders. It is designed to process multiple folders in parallel and independently. To do this, each folder is processed by an ExecutorService whose maximum pool size matches the number of CPUs in the computer.
For example, on an 8-CPU computer eight folders can (in theory) be processed concurrently, and on a 16-CPU computer 16 folders can be processed concurrently. If we have only 1 CPU, we set the pool size to 3 so that the CPU can continue doing something if one folder is blocked on I/O.
However, we don't actually have just one ExecutorService; we have several, because each folder can go through a number of stages.
Process1 (uses ExecutorService1) → Process2 (ExecutorService2) → Process3 (ExecutorService3)
Process1, Process2, Process3 and so on all implement Callable, and each has its own associated ExecutorService. There is a FileLoader process that we kick off; it loads folders, creates a Process1 callable for each folder, and submits it to the Process1 executor. Each Process1 callable does its work and then submits to a different callable, which may be Process2, Process3 etc., but we never go backwards, e.g. Process3 will never submit to Process1. We actually have 12 processes, but any particular folder is unlikely to go through all 12.
But I realized that this is flawed: on a 16-CPU computer each ES can have a pool size of 16, so with the three executors shown we actually have up to 48 threads running, and this will just lead to too much contention.
So what I was going to do was have all processes (Process1, Process2…) use the same ExecutorService, so that we only ever have as many worker threads as CPUs.
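A minimal sketch of that single shared pool, assuming a plain fixed-size pool from the JDK is acceptable. The class name SharedPool and the constant MIN_WORKER_THREADS are made up for illustration; the floor of 3 threads mirrors the single-CPU rule described above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SharedPool {
    // Hypothetical constant: minimum number of workers, so a single-CPU
    // machine still has threads available while one is blocked on I/O.
    static final int MIN_WORKER_THREADS = 3;

    /** Returns the pool size to use for a machine with the given CPU count. */
    static int workerCount(int availableCpus) {
        return Math.max(MIN_WORKER_THREADS, availableCpus);
    }

    /** Creates the one pool that every processing stage would share. */
    static ExecutorService create() {
        int cpus = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(workerCount(cpus));
    }
}
```

Every stage would then submit to the ExecutorService returned by SharedPool.create() instead of owning its own pool.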
However, in my current setup, we have a SongLoader process that has just one task submitted (loading all folders), and we then call shutdown(). This won't complete until everything has been submitted to Process0; then shutdown() on Process0 won't succeed until everything has been sent to Process1, and so on.
//Init Services
services.add(songLoaderService);
services.add(Process1.getExecutorService());
services.add(Process2.getExecutorService());
services.add(Process3.getExecutorService());
for (ExecutorService service : services)
{
    //Request Shutdown
    service.shutdown();
    //Now wait for all submitted tasks to complete
    service.awaitTermination(10, TimeUnit.DAYS);
}
//...............
//Finish Off work
However, if everything were on the same ES and Process1 was submitting to Process2, this would no longer work: at the time shutdown() was called, Process1 would not yet have submitted every folder to Process2, so the ES would be shut down prematurely.
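The premature shutdown can be reproduced in a small sketch. This is an illustration only, not the application's code: PrematureShutdownDemo is a made-up class, and a latch is used to force shutdown() to happen before the running task tries to submit its follow-up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class PrematureShutdownDemo {
    /** Returns true if the running task was refused when submitting its follow-up task. */
    static boolean followUpRejected() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch shutdownRequested = new CountDownLatch(1);
        AtomicBoolean rejected = new AtomicBoolean(false);

        // Stands in for Process1: it wants to hand work on to Process2.
        pool.submit(() -> {
            try {
                shutdownRequested.await();   // wait until shutdown() has been called
                pool.submit(() -> { });      // stands in for submitting Process2
            } catch (RejectedExecutionException e) {
                rejected.set(true);          // a shut-down pool refuses new tasks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        pool.shutdown();                     // already-submitted tasks still run
        shutdownRequested.countDown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected.get();
    }
}
```

After shutdown() the pool still finishes tasks it has already accepted, but any new submission is rejected, which is why the follow-up work is lost.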
So how do I detect when all work has been completed using a single ExecutorService, when tasks on that ES can submit other tasks to the same ES?
Or is there a better approach?
Note: you might wonder why I don't just merge the logic of Process1, 2 and 3 into a single process. The difficulty is that although I initially group songs by folder, sometimes the songs get split into smaller groups that are allocated to separate processes down the line, and not necessarily the same process; there are actually 12 processes in total.
Attempt based on Shloim's idea
Main Thread
private static List<Future> futures = Collections.synchronizedList(new ArrayList<Future>());
private static AnalyserService analyserService = new MainAnalyserService(SongKongThreadGroup.THREAD_WORKER);
...
SongLoader loader = SongLoader.getInstanceOf(parentFolder);
ExecutorService songLoaderService = SongLoader.getExecutorService();
songLoaderService.submit(loader);
for(Future future : futures)
{
try
{
future.get();
}
catch (InterruptedException ie)
{
SongKong.logger.warning(">>>>>> Interrupted - shutting down tasks immediately");
getAnalyserService().getExecutorService().awaitTermination(30, TimeUnit.SECONDS);
}
catch(ExecutionException e)
{
SongKong.logger.log(Level.SEVERE, ">>>>>> ExecutionException:"+e.getMessage(), e);
}
}
songLoaderService.shutdown();
With Process code submitting new tasks using this function from MainAnalyserService
public void submit(Callable<Boolean> task) //throws Exception
{
FixSongsController.getFutures().add(getExecutorService().submit(task));
}
It looked like it was working but it failed with
java.util.ConcurrentModificationException
at java.base/java.util.ArrayList$Itr.checkForComodification(Unknown Source)
at java.base/java.util.ArrayList$Itr.next(Unknown Source)
at com.jthink.songkong.analyse.toplevelanalyzer.FixSongsController.start(FixSongsController.java:220)
at com.jthink.songkong.ui.swingworker.FixSongs.doInBackground(FixSongs.java:49)
at com.jthink.songkong.ui.swingworker.FixSongs.doInBackground(FixSongs.java:18)
at java.desktop/javax.swing.SwingWorker$1.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.desktop/javax.swing.SwingWorker.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
and I now realize I cannot have one thread iterating over the list and calling future.get() (which waits until done) whilst other threads are adding to the list at the same time.
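The failure does not depend on timing: Collections.synchronizedList() makes individual calls thread-safe, but its iterator is still the fail-fast ArrayList iterator. The sketch below is an illustration with made-up names. It uses a single thread, with an add() inside the loop standing in for a worker thread adding a Future, and contrasts that with draining a ConcurrentLinkedQueue via poll().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class GrowingCollectionDemo {
    /** Iterates a synchronized list while it grows; returns true if iteration failed. */
    static boolean listIterationFails() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1);
        list.add(2);
        try {
            for (Integer item : list) {
                list.add(item + 10); // stands in for another thread adding a Future
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Drains a concurrent queue with poll() while it grows; returns how many items were seen. */
    static int queueDrainCount() {
        Queue<Integer> queue = new ConcurrentLinkedQueue<>();
        queue.add(1);
        queue.add(2);
        int seen = 0;
        Integer head;
        while ((head = queue.poll()) != null) {
            seen++;
            if (head < 10) {
                queue.add(head + 10); // items added mid-drain are picked up too
            }
        }
        return seen;
    }
}
```

Polling removes the head on each step instead of holding an iterator, so additions made while draining are safe.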
Answer 1:
I agree with Shloim that you don't need multiple ExecutorService instances here: just one (sized to the number of CPUs you have available) is sufficient and actually optimal. In fact, you might not need an ExecutorService at all; a simple Executor can do the job if you use an external mechanism for signaling completion.
I would start by building a class to represent the entirety of a larger work item. If you need to consume the results from each child work item, you could use a queue, but if you just want to know if there is work left to do, you only need a counter.
For example, you could do something like this:
public class FolderWork implements Runnable {
private final Executor executor;
private final File folder;
private int pendingItems; // guarded by monitor lock on this instance
public FolderWork(Executor executor, File folder) {
this.executor = executor;
this.folder = folder;
}
@Override
public void run() {
for (File file : folder.listFiles()) {
enqueueMoreWork(file);
}
}
public synchronized void enqueueMoreWork(File file) {
pendingItems++;
executor.execute(new FileWork(file, this));
}
public synchronized void markWorkItemCompleted() {
pendingItems--;
notifyAll();
}
public synchronized boolean hasPendingWork() {
return pendingItems > 0;
}
public synchronized void awaitCompletion() throws InterruptedException {
while (pendingItems > 0) {
wait();
}
}
}
public class FileWork implements Runnable {
private final File file;
private final FolderWork parent;
public FileWork(File file, FolderWork parent) {
this.file = file;
this.parent = parent;
}
@Override
public void run() {
try {
// do some work with the file
if (/* found more work to do */) {
parent.enqueueMoreWork(...);
}
} finally {
parent.markWorkItemCompleted();
}
}
}
If you're worried about synchronization overhead for the pendingItems counter, you can use an AtomicInteger for it instead. Then you need a separate mechanism for notifying a waiting thread that the work is done; for example, you can use a CountDownLatch. Here's an example implementation:
public class FolderWork implements Runnable {
private final Executor executor;
private final File folder;
private final AtomicInteger pendingItems = new AtomicInteger(0);
private final CountDownLatch latch = new CountDownLatch(1);
public FolderWork(Executor executor, File folder) {
this.executor = executor;
this.folder = folder;
}
@Override
public void run() {
for (File file : folder.listFiles()) {
enqueueMoreWork(file);
}
}
public void enqueueMoreWork(File file) {
if (latch.getCount() == 0) {
throw new IllegalStateException(
"Cannot call enqueueMoreWork() again after awaitCompletion() returns!");
}
pendingItems.incrementAndGet();
executor.execute(new FileWork(file, this));
}
public void markWorkItemCompleted() {
int remainingItems = pendingItems.decrementAndGet();
if (remainingItems == 0) {
latch.countDown();
}
}
public boolean hasPendingWork() {
return pendingItems.get() > 0;
}
public void awaitCompletion() throws InterruptedException {
latch.await();
}
}
You would call this like so:
Executor executor = Executors.newCachedThreadPool(...);
FolderWork topLevel = new FolderWork(executor, new File(...));
executor.execute(topLevel);
topLevel.awaitCompletion();
This example only shows one level of child work items, but you can use any number of levels of child work items as long as they all use the same pendingItems counter to keep track of how much work is left to do.
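To illustrate several levels sharing one counter, here is a hedged sketch. NestedWorkDemo and its rule that every item spawns two children are invented for the example. Each item is counted before it is handed to the executor, and a parent enqueues its children before its own decrement, so the counter cannot reach zero while work remains.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class NestedWorkDemo {
    private final ExecutorService executor;
    private final AtomicInteger pendingItems = new AtomicInteger(0);
    private final AtomicInteger completedItems = new AtomicInteger(0);
    private final CountDownLatch done = new CountDownLatch(1);

    NestedWorkDemo(ExecutorService executor) {
        this.executor = executor;
    }

    /** Counts the item first, then schedules it; children share the same counter. */
    void enqueue(int depth) {
        pendingItems.incrementAndGet();
        executor.execute(() -> {
            try {
                if (depth > 0) {          // hypothetical rule: each item spawns two children
                    enqueue(depth - 1);
                    enqueue(depth - 1);
                }
                completedItems.incrementAndGet();
            } finally {
                if (pendingItems.decrementAndGet() == 0) {
                    done.countDown();     // last item out signals the waiting thread
                }
            }
        });
    }

    /** Runs a binary tree of work items of the given depth; returns how many items ran. */
    static int runTree(int depth) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            NestedWorkDemo work = new NestedWorkDemo(pool);
            work.enqueue(depth);
            if (!work.done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for work");
            }
            return work.completedItems.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();              // safe now: nothing is left to resubmit
        }
    }
}
```

With depth 3 the tree has 1 + 2 + 4 + 8 = 15 items, and runTree(3) returns only after all of them have run.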
Answer 2:
Do not shutdown() the ExecutorService. Instead, create Callable objects and keep the Future objects returned when you submit them.
Now you can wait on the Future objects instead of waiting on the ExecutorService. Note that you will now have to wait on every Future separately, but if you only need to know when the last one finishes, you can just as well iterate over them in any order and call get().
Any task can submit more tasks, and it needs to make sure to put each resulting Future in a queue that is monitored by your main thread.
// put these somewhere public
ConcurrentLinkedQueue<Future<Boolean>> futures = new ConcurrentLinkedQueue<Future<Boolean>>();
ExecutorService executor = ...
void submit(Callable<Boolean> c) {
futures.add(executor.submit(c));
}
Now your main thread can start submitting tasks and wait for all tasks and subtasks:
void mainThread() throws InterruptedException {
    // add some tasks from main thread
    for (int i = 0; i < N; ++i) {
        Callable<Boolean> callable = new Callable<Boolean>() {
            @Override
            public Boolean call() throws Exception {
                ...
            }
        };
        submit(callable);
    }
    Future<Boolean> head = null;
    while ((head = futures.poll()) != null) {
        try {
            head.get();
        } catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
    }
    // At this point, all of your tasks are complete including subtasks.
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.MINUTES); // should return almost immediately
}
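A self-contained sketch of this pattern follows, with made-up names (FutureQueueDemo, run). It relies on one assumption worth stating: a task submits its subtasks while it is still running, so each subtask's Future is already in the queue before the parent's get() returns.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class FutureQueueDemo {
    private final Queue<Future<Boolean>> futures = new ConcurrentLinkedQueue<>();
    private final ExecutorService executor = Executors.newFixedThreadPool(4);
    private final AtomicInteger completed = new AtomicInteger(0);

    /** Submits a task that, while still running, submits `children` follow-up tasks. */
    private void submit(int children) {
        futures.add(executor.submit(() -> {
            for (int i = 0; i < children; i++) {
                submit(0);                // subtask Future is queued before this task ends
            }
            completed.incrementAndGet();
            return Boolean.TRUE;
        }));
    }

    /** Runs `roots` tasks, each spawning `children` subtasks; returns the number completed. */
    static int run(int roots, int children) {
        FutureQueueDemo demo = new FutureQueueDemo();
        try {
            for (int i = 0; i < roots; i++) {
                demo.submit(children);
            }
            Future<Boolean> head;
            while ((head = demo.futures.poll()) != null) {
                head.get();               // blocks until that task has finished
            }
            return demo.completed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            demo.executor.shutdown();     // queue is empty, so no task will resubmit
        }
    }
}
```

If a task handed work to another thread that submitted later, after the task itself had finished, the queue could look empty too early; the sketch does not cover that case.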
Answer 3:
This is essentially @DanielPryden's solution, but I have massaged it a little so that it shows more clearly how to solve my particular issue.
Created a new class MainAnalyserService that handles creation of the ExecutorService and counts when new Callable tasks are submitted and when they have completed:
public class MainAnalyserService
{
public static final int MIN_NUMBER_OF_WORKER_THREADS = 3;
protected static int BOUNDED_QUEUE_SIZE = 100;
private final AtomicInteger pendingItems = new AtomicInteger(0);
private final CountDownLatch latch = new CountDownLatch(1);
private static final int TIMEOUT_PER_TASK = 30;
protected ExecutorService executorService;
protected String threadGroup;
public MainAnalyserService(String threadGroup)
{
this.threadGroup=threadGroup;
initExecutorService();
}
protected void initExecutorService()
{
int workerSize = Runtime.getRuntime().availableProcessors();
//Even with a single CPU we still use multiple threads so we don't have just one thread waiting on I/O
if(workerSize< MIN_NUMBER_OF_WORKER_THREADS)
{
workerSize = MIN_NUMBER_OF_WORKER_THREADS;
}
executorService = new TimeoutThreadPoolExecutor(workerSize,
new SongKongThreadFactory(threadGroup),
new LinkedBlockingQueue<Runnable>(BOUNDED_QUEUE_SIZE),
TIMEOUT_PER_TASK,
TimeUnit.MINUTES);
}
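public void submit(Callable<Boolean> task) //throws Exception
{
//Count the task before submitting it, so a fast task cannot call workDone() ahead of the increment
pendingItems.incrementAndGet();
executorService.submit(task);
}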
public void workDone()
{
int remainingItems = pendingItems.decrementAndGet();
if (remainingItems == 0)
{
latch.countDown();
}
}
public void awaitCompletion() throws InterruptedException{
latch.await();
}
}
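Two details of a counting submit() deserve care: the counter should go up before the task is handed over, and it should come back down if the executor refuses the task, otherwise awaitCompletion() could wait forever. The sketch below is an assumption-laden illustration, not the class above. CountingSubmitter is a made-up name, it uses a plain JDK executor rather than TimeoutThreadPoolExecutor, and it leaves out the latch for brevity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

final class CountingSubmitter {
    private final ExecutorService executorService;
    private final AtomicInteger pendingItems = new AtomicInteger(0);

    CountingSubmitter(ExecutorService executorService) {
        this.executorService = executorService;
    }

    /** Counts the task first; undoes the count if the executor rejects it. */
    void submit(Callable<Boolean> task) {
        pendingItems.incrementAndGet();
        try {
            executorService.submit(task);
        } catch (RejectedExecutionException e) {
            pendingItems.decrementAndGet(); // the task will never run or call workDone()
            throw e;
        }
    }

    int pending() {
        return pendingItems.get();
    }

    /** Submits to an already shut-down pool and reports the counter afterwards. */
    static int pendingAfterRejectedSubmit() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.shutdown();
        CountingSubmitter submitter = new CountingSubmitter(pool);
        try {
            submitter.submit(() -> Boolean.TRUE);
        } catch (RejectedExecutionException expected) {
            // the pool is shut down, so the task was refused
        }
        return submitter.pending();
    }
}
```

Whether a rejection can actually occur depends on the executor's queue and rejection policy, which are not shown in the original code.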
In the FixSongsController thread we have
analyserService = new MainAnalyserService(THREAD_WORKER);
//SongLoader uses CompletionService when it calls LoadFolderWorkers, so shutdown won't return until all initial folder submissions have completed
ExecutorService songLoaderService = SongLoader.getExecutorService();
songLoaderService.submit(loader);
songLoaderService.shutdown();
//Wait for all async tasks to complete
analyserService.awaitCompletion();
Then any Callable (such as Process1, Process2 etc.) calls submit() to submit a new Callable to the ExecutorService, and it must call workDone() when it has completed. To ensure this happens, I call workDone() in a finally block in the call() method of each Process class, e.g.
public Boolean call()
{
    try
    {
        //do stuff
        //Possibly make multiple calls to
        FixSongsController.getAnalyserService().submit();
        //Placeholder result so the method returns a value
        return true;
    }
    finally
    {
        FixSongsController.getAnalyserService().workDone();
    }
}
Source: https://stackoverflow.com/questions/56617083/how-do-i-know-when-executorservice-has-finished-if-items-on-the-es-can-resubmit