I have an application that is unresponsive and seems to be in a deadlock or something like a deadlock. See the two threads below. Notice that the My-Thread@101c
Some thread (I assume My-Thread@101c) is synchronized on your TransactionalSystemImpl instance. The UI thread is trying to enter executeImpl but is blocked on the synchronized monitor and cannot. Where else is the TransactionalSystemImpl instance being used (with synchronized entry)? Probably between
at com.acme.ui.ViewBuilder.renderOnEDT(ViewBuilder.java:157)
.
.
.
at com.acme.util.Job.run(Job.java:425)
invokeAndWait is not allowed from the EDT and should throw an exception. Judging by the stack trace, though, your wrapper thread is what lets you call invokeAndWait, and that's not right. Changing it to SwingUtilities.invokeLater should fix this problem.
Alternative solution: if it's worth the effort, you can also look into the SwingWorker class for worker threads. Here is the link:
http://docs.oracle.com/javase/tutorial/uiswing/concurrency/worker.html
Just adding some further information about the deadlock: the Javadoc of invokeAndWait clearly mentions "This will happen after all pending events are processed." That includes the current event, the one that is calling invokeAndWait. invokeAndWait will wait for the current event to finish, and the current event will wait for invokeAndWait to finish. That's a guaranteed deadlock, and that's why it isn't allowed.
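To make that circular wait concrete, here is a minimal sketch that does not use Swing at all. It assumes a single-threaded executor (called fakeEdt here, a made-up name) is a fair stand-in for the event queue, and it uses a timeout so the demo reports the stall instead of hanging forever. SelfWaitDemo and nestedWaitStalls are hypothetical names, not anything from the OP's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo: a single-threaded executor stands in for the EDT.
class SelfWaitDemo {

    /** Returns true if the nested task could not run while the outer task waited for it. */
    static boolean nestedWaitStalls() {
        ExecutorService fakeEdt = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = fakeEdt.submit(() -> {
                // The "current event" posts a second event to its own queue...
                Future<?> inner = fakeEdt.submit(() -> { });
                try {
                    // ...and waits for it, but the only thread that could run it is this one.
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException stalled) {
                    return true; // without the timeout this wait would never end
                }
            });
            return outer.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            fakeEdt.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("nested wait stalled: " + nestedWaitStalls());
    }
}
```

The real invokeAndWait refuses to be called from the EDT precisely so that this situation cannot arise silently.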
I suspect the line 134 you quoted is not the real line 134 (this can be caused by stale code or some other issue). It seems that line 134 is waiting for a monitor, which most probably means synchronized(pendingEntries) (or the clock.latch() call, which I assume is some kind of countdown latch?).
From the stack trace, the AWT event dispatching thread is waiting for a monitor, which is held by MyThread.
Please check the code against the stack trace of MyThread. I believe that somewhere it synchronizes on pendingEntries, then uses invokeAndWait to ask the event dispatching thread to do something, and in turn the event dispatching thread waits for pendingEntries, which causes the deadlock.
A suggestion that is a bit off topic: your event dispatching thread seems to be doing a lot more than it should. I don't think doing transaction handling and the like in the event dispatching thread is a good choice. Such actions can be slow (and in this case even block the event dispatching thread), which makes the UI unresponsive.
Splitting such actions off to a separate thread/executor seems a better choice to me.
If there are no other deadlocks lurking, you can transform the call to EventQueue.invokeLater(Runnable) into a blocking version that waits until your Runnable has completed:
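if (EventQueue.isDispatchThread()) {
    r.run();
} else {
    final Lock lock = new ReentrantLock();
    final AtomicBoolean locked = new AtomicBoolean(true);
    final Condition condition = lock.newCondition();
    EventQueue.invokeLater(() -> {
        try {
            r.run();
        } finally {
            // Wake the waiter even if r.run() throws.
            lock.lock(); // acquire outside the inner try so unlock() only runs if we hold the lock
            try {
                locked.set(false);
                condition.signalAll();
            } finally {
                lock.unlock();
            }
        }
    });
    lock.lock();
    try {
        while (locked.get())
            condition.await(); // throws InterruptedException; the enclosing method must handle or declare it
    } finally {
        lock.unlock();
    }
}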
It seems to be well known among Swing developers of my acquaintance that invokeAndWait is problematic, but maybe this isn't as well known as I had thought. I seem to recall having seen stern warnings in the documentation about difficulties in using invokeAndWait properly, yet I'm having a hard time finding anything. I cannot find anything in current, official documentation. The only thing I've been able to find is this line from an old version of the Swing Tutorial from 2005: (web archive)
If you use
invokeAndWait
, make sure that the thread that calls invokeAndWait does not hold any locks that other threads might need while the call is occurring.
Unfortunately this line seems to have disappeared from the current Swing tutorial. Even this is rather an understatement; I'd have preferred that it say something like, "If you use invokeAndWait, the thread that calls invokeAndWait must not hold any locks that other threads might need while the call is occurring." Since it's generally difficult to know which locks other threads might need at any given time, the safest policy is probably to ensure that the thread calling invokeAndWait doesn't hold any locks at all.
(This is pretty difficult to do, and it's why I said above that invokeAndWait is problematic. I also know that the designers of JavaFX -- essentially a Swing replacement -- defined in the javafx.application.Platform class a method called runLater which is functionally equivalent to invokeLater. But they deliberately omitted an equivalent method to invokeAndWait because it's very difficult to use properly.)
The reason is fairly straightforward to derive from first principles. Consider a system similar to the one described by the OP, having two threads: MyThread and the Event Dispatch Thread (EDT). MyThread takes a lock on object L and then calls invokeAndWait
. This posts event E1 and waits for it to be processed by the EDT. Suppose that E1's handler needs to lock L. When the EDT processes event E1, it attempts to take the lock on L. This lock is held already by MyThread, which won't relinquish it until the EDT processes E1, but that processing is blocked by MyThread. Thus we have deadlock.
Here's a variation on this scenario. Suppose we ensure that processing E1 doesn't require locking L. Will this be safe? No. The problem can still occur if, just before MyThread calls invokeAndWait
, an event E0 is posted to the event queue, and E0's handler requires locking on L. As before, MyThread holds the lock on L, so processing of E0 is blocked. E1 is behind E0 in the event queue so processing of E1 is blocked too. Since MyThread is waiting for E1 to be processed, and it's blocked by E0, which in turn is blocked waiting for MyThread to relinquish the lock on L, we have deadlock again.
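The E0/E1 variation can be reproduced without Swing. The following is only a sketch under assumptions: a single-threaded executor (fakeEdt, a made-up name) plays the event queue, the calling thread plays MyThread, and a timeout replaces the unbounded wait of invokeAndWait so the demo can report the stall and then clean up. QueuedEventDeadlockDemo and workerStallsBehindE0 are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo of the E0/E1 scenario; the executor stands in for the EDT.
class QueuedEventDeadlockDemo {

    /** Returns true if E1 could not be processed while the caller held L. */
    static boolean workerStallsBehindE0() {
        final Object lockL = new Object();
        ExecutorService fakeEdt = Executors.newSingleThreadExecutor();
        try {
            synchronized (lockL) { // the calling thread plays MyThread and holds L
                // E0: its handler needs L, so it blocks as soon as it starts.
                fakeEdt.submit(() -> {
                    synchronized (lockL) {
                        // would update the shared model here
                    }
                });
                // E1: needs no locks at all, but it is queued behind E0.
                Future<?> e1 = fakeEdt.submit(() -> { });
                try {
                    e1.get(200, TimeUnit.MILLISECONDS); // invokeAndWait would wait forever
                    return false;
                } catch (TimeoutException stalled) {
                    return true;
                }
            } // L is released here, which finally lets E0 (and then E1) run
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            fakeEdt.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("E1 stalled behind E0: " + workerStallsBehindE0());
    }
}
```

Note that E1 itself is harmless; it stalls purely because of its position in the queue.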
This sounds fairly similar to what's going on in the OP's application. According to the OP's comments on this answer,
Yes, renderOnEDT is synchronized on something way up in the call stack, the com.acme.persistence.TransactionalSystemImpl.executeImpl method which is synchronized. And renderOnEDT is waiting to enter that same method. So, that is the source of the deadlock it looks like. Now I have to figure out how to fix it.
We don't have a complete picture, but this is probably enough to go on. renderOnEDT
is being called from MyThread, which is holding a lock on something while it's blocked in invokeAndWait
. It's waiting for an event to be processed by the EDT, but we can see the EDT is blocked on something held by MyThread. We can't quite tell exactly which object this is, but it kind of doesn't matter -- the EDT is clearly blocked on a lock held by MyThread, and MyThread is clearly waiting for the EDT to process an event: thus, deadlock.
Note also that we can be fairly sure the EDT isn't currently processing the event posted by invokeAndWait
(analogous to E1 in my scenario above). If it were, the deadlock would occur every time. It seems to occur only sometimes, and according to a comment from the OP on this answer, when the user is typing quickly. So I'd bet that the event currently being processed by the EDT is a keystroke that happened to be posted to the event queue after MyThread took its lock, but before MyThread called invokeAndWait
to post E1 to the event queue, thus it's analogous to E0 in my scenario above.
So far, this is probably mostly a recap of the problem, pieced together from other answers and from the OP's comments on those answers. Before we proceed to talking about a solution, here are some assumptions I'm making about the OP's application:
It's multi-threaded, so various objects must be synchronized to work properly. This includes calls from Swing event handlers, which presumably update some model based on user interaction, and this model is also processed by worker threads such as MyThread. Therefore, they must lock such objects properly. Removing synchronization will definitely avoid deadlocks, but other bugs will creep in as the data structures are corrupted by unsynchronized concurrent access.
The application isn't necessarily performing long-running operations on the EDT. This is a typical problem with GUI apps but it doesn't seem to be happening here. I'm assuming that the application works fine in most cases, where an event processed on the EDT grabs a lock, updates something, then releases the lock. The problem occurs when it can't get the lock because the lock's holder is deadlocked on the EDT.
Changing invokeAndWait
to invokeLater
isn't an option. The OP has said that doing so causes other problems. This isn't surprising, as that change causes execution to occur in a different order, so it will give different results. I'll assume they would be unacceptable.
If we can't remove locks, and we can't change to invokeLater
, we're left with calling invokeAndWait
safely. And "safely" means relinquishing locks before calling it. This might be arbitrarily hard to do given the organization of the OP's application, but I think it's the only way to proceed.
Let's look at what MyThread is doing. This is much simplified, as there are probably a bunch of intervening method calls on the stack, but fundamentally it's something like this:
synchronized (someObject) {
// code block 1
SwingUtilities.invokeAndWait(handler);
// code block 2
}
The problem occurs when some event sneaks in the queue in front of handler, and that event's processing requires locking someObject. How can we avoid this problem? You can't relinquish one of Java's built-in monitor locks within a synchronized block, so you have to close the block, make your call, and open it again:
synchronized (someObject) {
// code block 1
}
SwingUtilities.invokeAndWait(handler);
synchronized (someObject) {
// code block 2
}
This could be arbitrarily difficult if the lock on someObject
is taken fairly far up the call stack from the call to invokeAndWait
, but I think doing this refactoring is unavoidable.
There are other pitfalls, too. If code block 2 depends on some state loaded by code block 1, that state might be out of date by the time code block 2 takes the lock again. This implies that code block 2 must reload any state from the synchronized object. It mustn't make any assumptions based on results from code block 1, since those results might be out of date.
Here's another issue. Suppose the handler being run by invokeAndWait
requires some state loaded from the shared object, for example,
synchronized (someObject) {
// code block 1
SwingUtilities.invokeAndWait(handler(state1, state2));
// code block 2
}
You couldn't just migrate the invokeAndWait
call out of the synchronized block, since that would require unsynchronized access getting state1 and state2. What you have to do instead is to load this state into local variables while within the lock, then make the call using those locals after releasing the lock. Something like:
int localState1;
String localState2;
synchronized (someObject) {
// code block 1
localState1 = state1;
localState2 = state2;
}
SwingUtilities.invokeAndWait(handler(localState1, localState2));
synchronized (someObject) {
// code block 2
}
The technique of making calls after having released locks is called the open call technique. See Doug Lea, Concurrent Programming in Java (2nd edition), sec 2.4.1.3. There is also a good discussion of this technique in Goetz et al., Java Concurrency In Practice, sec 10.1.4. In fact all of section 10.1 covers deadlock fairly thoroughly; I recommend it highly.
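Here is a small, Swing-free sketch of an open call, including the stale-state pitfall mentioned above. The assumptions: a single-threaded executor (fakeEdt) stands in for the EDT, submit(...).get(...) stands in for invokeAndWait, and the shared state is a single int. OpenCallDemo and its variable names are invented for the illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of an open call; the executor stands in for the EDT.
class OpenCallDemo {

    /** Returns {value the handler was given, value code block 2 reloaded}. */
    static int[] run() {
        final Object lockL = new Object();
        final int[] shared = {41};             // shared state, guarded by lockL
        final int[] seenByHandler = new int[1];
        ExecutorService fakeEdt = Executors.newSingleThreadExecutor();
        try {
            final int snapshot;
            synchronized (lockL) {
                // code block 1: meanwhile an event that needs L (E0) gets queued
                fakeEdt.submit(() -> {
                    synchronized (lockL) {
                        shared[0]++;
                    }
                });
                snapshot = shared[0];          // copy what the handler needs
            }                                  // release L BEFORE the blocking call

            // Stand-in for invokeAndWait(handler(localState)): post and wait.
            fakeEdt.submit(() -> { seenByHandler[0] = snapshot; })
                   .get(5, TimeUnit.SECONDS);

            synchronized (lockL) {
                // code block 2: reload the state, do not trust the snapshot
                return new int[] {seenByHandler[0], shared[0]};
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            fakeEdt.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("handler saw " + result[0] + ", block 2 reloaded " + result[1]);
    }
}
```

In this sketch E0 gets to run as soon as the lock is released, so the handler works with the older snapshot while code block 2 sees the updated value. That is exactly why code block 2 has to reload.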
In summary, I believe that using techniques I describe above, or in the books cited, will solve this deadlock problem correctly and safely. However, I am sure that it will require a lot of careful analysis and difficult restructuring as well. I don't see an alternative, though.
(Finally, I should say that while I am an employee of Oracle, this is not in any way an official statement of Oracle.)
UPDATE
I thought of a couple more potential refactorings that might help solve the problem. Let's reconsider the original schema of the code:
synchronized (someObject) {
// code block 1
SwingUtilities.invokeAndWait(handler);
// code block 2
}
This executes code block 1, handler, and code block 2 in order. If we were to change the invokeAndWait
call to invokeLater
, the handler would be executed after code block 2. One can easily see that would be a problem for the application. Instead, how about we move code block 2 into the invokeAndWait
so that it executes in the right order, but still on the event thread?
synchronized (someObject) {
    // code block 1
}
SwingUtilities.invokeAndWait(() -> {
    synchronized (someObject) {
        handler.run();
        // code block 2
    }
});
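To check that this keeps the ordering intact, here is a rough Swing-free sketch. It assumes a single-threaded executor (fakeEdt) in place of the EDT and records the order of steps in a list; MoveIntoHandlerDemo and the step labels are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: code block 2 runs inside the handler, on the fake EDT.
class MoveIntoHandlerDemo {

    /** Returns the order in which the steps actually ran. */
    static List<String> run() {
        final Object lockL = new Object();
        final List<String> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService fakeEdt = Executors.newSingleThreadExecutor();
        try {
            synchronized (lockL) {
                order.add("block1");
                // An unrelated event that needs L is queued while we hold L.
                fakeEdt.submit(() -> {
                    synchronized (lockL) {
                        order.add("E0");
                    }
                });
            } // L is released before we block on the fake EDT

            // Stand-in for invokeAndWait: post the handler and wait for it.
            fakeEdt.submit(() -> {
                synchronized (lockL) {
                    order.add("handler");
                    order.add("block2");
                }
            }).get(5, TimeUnit.SECONDS);

            return new ArrayList<>(order);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            fakeEdt.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The trade-off is that code block 2 now runs on the event thread, so this only makes sense if that block is short.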
Here's another approach. I don't know exactly what the handler passed to invokeAndWait
is intended to do. But one reason it might need to be invokeAndWait
is that it reads some information out of the GUI and then uses this to update the shared state. This has to be on the EDT, since it interacts with GUI objects, and invokeLater
can't be used since it would occur in the wrong order. This suggests calling invokeAndWait
before doing other processing in order to read information out of the GUI into a temporary area, then use this temporary area to perform continued processing:
final TempState tempState = new TempState();
SwingUtilities.invokeAndWait(() -> {
    synchronized (someObject) {
        handler.run();
        tempState.update();
    }
});
synchronized (someObject) {
    // code block 1
    // instead of invokeAndWait, use tempState from above
    // code block 2
}
It's hard to tell without seeing the code, but from the stack trace, it looks like you're firing some sort of transactional code from the event dispatch thread. Does that code then kick off an instance of My-Thread? The EDT could be blocked waiting for My-Thread from within the transactional code, but My-Thread can't finish because it needs the EDT.
If this is the case, you can use SwingUtilities.invokeLater for your rendering so that the EDT finishes the transactional code first and then renders the updates. Alternatively, you can avoid running the transactional code on the EDT at all. For actual work that's not related to rendering, you should use a SwingWorker to avoid doing any heavy processing on the EDT.