Question
I have an app that uses a SurfaceView to display its UI. The app runs stably for about 18k users, but three devices get an ANR when returning to the SurfaceView activity (SurfaceView activity -> regular activity -> back to SurfaceView activity).
The 3 devices are:
- Onda Tablet (Allwinner A31 Chipset)
- Sero 8 (Rockchip chipset)
- Acer 10" Tablet with Intel Atom
I tried to recreate the ANR but failed. According to my users the app runs fine for hours without any problems except for the devices listed above.
ANR Stacktrace from Android 4.2:
DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 WAIT
| group="main" sCount=1 dsCount=0 obj=0x41c899a0 self=0x41a6c010
| sysTid=5497 nice=0 sched=0/0 cgrp=apps handle=1074877404
| state=S schedstat=( 0 0 0 ) utm=541 stm=129 core=2
at java.lang.Object.wait(Native Method)
- waiting on <0x41c89da0> (a java.lang.VMThread) held by tid=1 (main)
at java.lang.Thread.parkFor(Thread.java:1231)
at sun.misc.Unsafe.park(Unsafe.java:323)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:159)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:810)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:843)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1173)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:183)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:259)
at android.view.SurfaceView.updateWindow(SurfaceView.java:597)
at android.view.SurfaceView.onWindowVisibilityChanged(SurfaceView.java:329)
at android.view.View.dispatchWindowVisibilityChanged(View.java:7544)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1039)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1039)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1039)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1039)
at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1224)
at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1002)
at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:4400)
at android.view.Choreographer$CallbackRecord.run(Choreographer.java:749)
at android.view.Choreographer.doCallbacks(Choreographer.java:562)
at android.view.Choreographer.doFrame(Choreographer.java:532)
at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:735)
at android.os.Handler.handleCallback(Handler.java:725)
at android.os.Handler.dispatchMessage(Handler.java:92)
at android.os.Looper.loop(Looper.java:137)
at android.app.ActivityThread.main(ActivityThread.java:5041)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:817)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:584)
at dalvik.system.NativeStart.main(Native Method)
"SurfaceDraw" prio=5 tid=15 SUSPENDED
| group="main" sCount=1 dsCount=0 obj=0x42867d30 self=0x69a65b38
| sysTid=12101 nice=0 sched=0/0 cgrp=apps handle=1753253360
| state=S schedstat=( 0 0 0 ) utm=7914 stm=23 core=0
at android.graphics.Canvas.native_drawARGB(Native Method)
at android.graphics.Canvas.drawARGB(Canvas.java:801)
at com.davidgiga1993.mixingstationlibrary.surface.BaseSurface.b(BaseSurface.java:167)
at com.davidgiga1993.mixingstationlibrary.surface.k.run(DrawThread.java:27)
"AsyncTask #3" prio=5 tid=14 WAIT
| group="main" sCount=1 dsCount=0 obj=0x421734d0 self=0x690d59d8
| sysTid=5820 nice=0 sched=0/0 cgrp=apps handle=1762495280
| state=S schedstat=( 0 0 0 ) utm=0 stm=0 core=0
at java.lang.Object.wait(Native Method)
- waiting on <0x421735f0> (a java.lang.VMThread) held by tid=14 (AsyncTask #3)
at java.lang.Thread.parkFor(Thread.java:1231)
at sun.misc.Unsafe.park(Unsafe.java:323)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:159)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2019)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:413)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1013)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:573)
at java.lang.Thread.run(Thread.java:856)
"Binder_3" prio=5 tid=13 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x4215a380 self=0x684f9540
| sysTid=5691 nice=0 sched=0/0 cgrp=apps handle=1693280728
| state=S schedstat=( 0 0 0 ) utm=3 stm=0 core=3
#00 pc 00016fe4 /system/lib/libc.so (__ioctl+8)
#01 pc 0002a75d /system/lib/libc.so (ioctl+16)
#02 pc 00016ba1 /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+132)
#03 pc 00017363 /system/lib/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+154)
#04 pc 0001b15d /system/lib/libbinder.so
#05 pc 00011267 /system/lib/libutils.so (android::Thread::_threadLoop(void*)+114)
#06 pc 00046887 /system/lib/libandroid_runtime.so (android::AndroidRuntime::javaThreadShell(void*)+66)
#07 pc 00010dcd /system/lib/libutils.so
#08 pc 0000e3d8 /system/lib/libc.so (__thread_entry+72)
#09 pc 0000dac4 /system/lib/libc.so (pthread_create+160)
at dalvik.system.NativeStart.run(Native Method)
"AsyncTask #2" prio=5 tid=12 WAIT
| group="main" sCount=1 dsCount=0 obj=0x42145450 self=0x64f1aac8
| sysTid=5525 nice=0 sched=0/0 cgrp=apps handle=1693560600
| state=S schedstat=( 0 0 0 ) utm=0 stm=0 core=3
at java.lang.Object.wait(Native Method)
- waiting on <0x421455c8> (a java.lang.VMThread) held by tid=12 (AsyncTask #2)
at java.lang.Thread.parkFor(Thread.java:1231)
at sun.misc.Unsafe.park(Unsafe.java:323)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:159)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2019)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:413)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1013)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:573)
at java.lang.Thread.run(Thread.java:856)
"AsyncTask #1" prio=5 tid=11 WAIT
| group="main" sCount=1 dsCount=0 obj=0x42141d40 self=0x64ef35e8
| sysTid=5524 nice=0 sched=0/0 cgrp=apps handle=1693143224
| state=S schedstat=( 0 0 0 ) utm=0 stm=0 core=2
at java.lang.Object.wait(Native Method)
- waiting on <0x42141ed8> (a java.lang.VMThread) held by tid=11 (AsyncTask #1)
at java.lang.Thread.parkFor(Thread.java:1231)
at sun.misc.Unsafe.park(Unsafe.java:323)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:159)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2019)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:413)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1013)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:573)
at java.lang.Thread.run(Thread.java:856)
As I read the trace, the ANR occurs before surfaceCreated gets called.
Here is the source code of the SurfaceView and the drawing thread:
public class BaseSurface extends SurfaceView implements SurfaceHolder.Callback
{
    protected SurfaceHolder holder;
    private DrawThread drawThread;

    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int width, int height)
    {
        Log.d("Surface", "Changed");
    }

    @Override
    public void surfaceCreated(SurfaceHolder holder)
    {
        synchronized (holder)
        {
            this.holder = holder;
            if (drawThread != null)
            {
                drawThread.Active = false;
                try
                {
                    drawThread.join();
                }
                catch (InterruptedException e)
                {
                }
            }
            drawThread = new DrawThread(this);
            drawThread.Active = true;
            drawThread.start();
        }
    }

    @Override
    public void surfaceDestroyed(SurfaceHolder holder)
    {
        synchronized (holder)
        {
            drawThread.Active = false;
            boolean retry = true;
            while (retry)
            {
                try
                {
                    drawThread.join();
                    retry = false;
                }
                catch (InterruptedException e)
                {
                }
            }
            drawThread = null;
            this.holder = null;
        }
    }

    public void Update()
    {
        if (holder == null)
            return;
        Canvas canvas = holder.lockCanvas();
        if (canvas != null)
        {
            synchronized (holder)
            {
                //drawing the ui...
                holder.unlockCanvasAndPost(canvas);
            }
        }
    }
}
public class DrawThread extends Thread
{
    public boolean Active = false;
    private BaseSurface surface;
    private long frameStartTime;
    public float FPS = 38f;// = 26fps; 27f = 37fps
    private int sleepTime;

    public DrawThread(BaseSurface surface)
    {
        super("SurfaceDraw");
        this.surface = surface;
    }

    @Override
    public void run()
    {
        while (Active)
        {
            frameStartTime = SystemClock.uptimeMillis();
            surface.Update();
            try
            {
                sleepTime = (int) (FPS - (SystemClock.uptimeMillis() - frameStartTime));
                if (sleepTime > 0 && sleepTime < 1000)
                {
                    Thread.sleep(sleepTime);
                }
            }
            catch (InterruptedException ex)
            {
                Log.d("DrawThread", "Interrupred");
            }
        }
        Log.d("DrawThread", "Finished");
    }
}
I have searched a lot over the last few days but haven't found any clue as to why this is happening. The only similar problem I found was here: https://groups.google.com/forum/#!msg/android-developers/0VuqnrYe7b0/Yw1mHodmrwoJ but the poster didn't post a solution, and that problem was not tied to a specific device.
Has anyone else run into this on specific devices and found a solution?
Edit:
I found a way to reproduce the ANR (randomly). The real problem that causes the ANR occurs when the surface activity gets closed. Here is the log output of a "good" close and a "bad" close:
Good
04-24 14:54:10.798: D/DrawThread(1526): Finished
04-24 14:54:10.798: D/Surface(1526): surfaceDestroyed
Bad
04-24 14:54:16.851: D/DrawThread(1526): Finished
04-24 14:54:16.851: D/Surface(1526): surfaceDestroyed
04-24 14:54:16.860: E/SurfaceHolder(1526): Exception locking surface
04-24 14:54:16.860: E/SurfaceHolder(1526): java.lang.IllegalArgumentException
04-24 14:54:16.860: E/SurfaceHolder(1526): at android.view.Surface.lockCanvasNative(Native Method)
04-24 14:54:16.860: E/SurfaceHolder(1526): at android.view.Surface.lockCanvas(Surface.java:76)
04-24 14:54:16.860: E/SurfaceHolder(1526): at android.view.SurfaceView$4.internalLockCanvas(SurfaceView.java:744)
04-24 14:54:16.860: E/SurfaceHolder(1526): at android.view.SurfaceView$4.lockCanvas(SurfaceView.java:720)
04-24 14:54:16.860: E/SurfaceHolder(1526): at com.davidgiga1993.mixingstationlibrary.surface.BaseSurface.Update(BaseSurface.java:169)
04-24 14:54:16.860: E/SurfaceHolder(1526): at com.davidgiga1993.mixingstationlibrary.surface.DrawThread.run(DrawThread.java:27)
04-24 14:54:16.860: D/Surface(1526): surfaceCreated
Why is an exception thrown? The draw thread is stopped before the surface is destroyed and nothing is touching the SurfaceView anymore. Also, why does surfaceCreated of the surface get called after that exception? The activity is not even visible anymore at that point.
I also tried removing all synchronized blocks, but that didn't change the behavior.
Answer 1:
Looking at the ANR trace in the android-developers link, they're running Android 4.2 and their main thread is stalling when the SurfaceView tries to lock its Surface. I believe the problem there is that the render thread has called lockCanvas(), which locks the Surface (using a ReentrantLock), and then something happened that caused the SurfaceView to need to update (e.g. its size or position changed). You can see in the trace for the thread that (presumably) called lockCanvas() that it's actively running ("Thread-3899" is in NATIVE with state=R) in some complicated-looking bit of Skia code. So either the Skia code is looping forever, or is just taking a really long time to finish.
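The stall described above can be modeled in plain Java, without any Android classes. In this rough sketch, surfaceLock is only a stand-in for the lock SurfaceView takes in lockCanvas() (an assumption based on the ReentrantLock frames in the trace), and the class and method names are hypothetical. A "render" thread holds the lock for a while, and the caller checks whether it can obtain the lock within a timeout, which is the position the main thread is in when updateWindow() blocks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Plain-Java model; no Android classes are used.
class SurfaceLockStallDemo {
    // Stand-in for the lock held between lockCanvas() and unlockCanvasAndPost().
    private static final ReentrantLock surfaceLock = new ReentrantLock();

    /**
     * Starts a "render" thread that holds surfaceLock for holdMillis, then tries
     * to take the lock from the calling thread, waiting at most waitMillis.
     * Returns true if the caller obtained the lock in time.
     */
    static boolean callerGetsLock(long holdMillis, long waitMillis) {
        CountDownLatch locked = new CountDownLatch(1);
        Thread render = new Thread(() -> {
            surfaceLock.lock();                // models lockCanvas()
            try {
                locked.countDown();
                Thread.sleep(holdMillis);      // models a slow draw call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                surfaceLock.unlock();          // models unlockCanvasAndPost()
            }
        }, "SurfaceDraw");
        render.start();
        try {
            locked.await();                    // wait until the render thread owns the lock
            boolean acquired = surfaceLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                surfaceLock.unlock();
            }
            render.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("slow draw, caller got lock: " + callerGetsLock(500, 50));
        System.out.println("fast draw, caller got lock: " + callerGetsLock(10, 500));
    }
}
```

The real main thread calls lock() with no timeout, so a draw call that never returns would block it indefinitely; tryLock is used here only so the demo terminates.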
In your case, the render thread (SurfaceDraw) is suspended, possibly because it finished what it was doing and was returning from native code to the VM. Yours was a simple drawARGB() call so I'm not sure why it would take so long. It's possible that something else stalled it, and this is just where it happened to be when the ANR snapshot mechanism finally caught up.
It might be wise to grab your lock on the SurfaceHolder before you call lockCanvas() to ensure that you don't block waiting for it with the Canvas lock held.
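As a rough illustration of that ordering, here is a plain-Java sketch rather than Android code. holderMonitor stands in for the SurfaceHolder monitor, surfaceLock stands in for the lock taken by lockCanvas(), and all class and method names are hypothetical. The caller plays the main thread: it holds the monitor and then checks whether it can still obtain the canvas lock while the draw thread is waiting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Plain-Java model of the two lock orders; no Android classes are used.
class LockOrderDemo {
    private final Object holderMonitor = new Object();             // stand-in for synchronized (holder)
    private final ReentrantLock surfaceLock = new ReentrantLock(); // stand-in for the lockCanvas() lock

    // Order from the question: canvas lock first, then the monitor.
    private void drawCanvasLockFirst(CountDownLatch started) {
        surfaceLock.lock();
        try {
            started.countDown();
            synchronized (holderMonitor) {
                // drawing would happen here
            }
        } finally {
            surfaceLock.unlock();
        }
    }

    // Suggested order: monitor first, then the canvas lock.
    private void drawMonitorFirst(CountDownLatch started) {
        started.countDown();
        synchronized (holderMonitor) {
            surfaceLock.lock();
            try {
                // drawing would happen here
            } finally {
                surfaceLock.unlock();
            }
        }
    }

    /** Returns true if the "main" thread can take surfaceLock while it holds the monitor. */
    boolean mainCanTakeSurfaceLock(boolean monitorFirst) {
        CountDownLatch started = new CountDownLatch(1);
        Thread draw = new Thread(() -> {
            if (monitorFirst) drawMonitorFirst(started); else drawCanvasLockFirst(started);
        }, "SurfaceDraw");
        boolean acquired = false;
        try {
            synchronized (holderMonitor) {
                draw.start();
                started.await();
                acquired = surfaceLock.tryLock(200, TimeUnit.MILLISECONDS);
                if (acquired) {
                    surfaceLock.unlock();
                }
            }
            draw.join(); // join only after the monitor has been released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired;
    }

    public static void main(String[] args) {
        System.out.println("canvas lock first: " + new LockOrderDemo().mainCanTakeSurfaceLock(false));
        System.out.println("monitor first:     " + new LockOrderDemo().mainCanTakeSurfaceLock(true));
    }
}
```

With the canvas lock taken first, the draw thread sits on surfaceLock while it waits for the monitor, so the first call reports false. With the monitor taken first, the draw thread waits empty-handed and the second call reports true.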
(FWIW, synchronizing on a SurfaceHolder instance makes me a little nervous, since you can't know if something in SurfaceView is going to lock it for its own nefarious purposes. Don't think that's the problem here though.)
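One common way to sidestep that concern is to synchronize on a private object that no other code can reach. The sketch below is plain Java with hypothetical names (PrivateLockDemo, sharedHolder, stateLock), where sharedHolder merely stands in for a SurfaceHolder instance. It only shows that an update guarded by a private lock can proceed while some other code holds the monitor of the shared object.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Plain-Java model; no Android classes are used.
class PrivateLockDemo {
    static final Object sharedHolder = new Object();  // reachable by other code
    private final Object stateLock = new Object();    // reachable only by this class
    private int frames;

    void updateUsingSharedMonitor() {
        synchronized (sharedHolder) { frames++; }
    }

    void updateUsingPrivateLock() {
        synchronized (stateLock) { frames++; }
    }

    /**
     * While the calling thread holds sharedHolder's monitor (as other code might),
     * runs one update on a worker thread and reports whether it finished within 500 ms.
     */
    static boolean updateFinishes(boolean usePrivateLock) {
        PrivateLockDemo demo = new PrivateLockDemo();
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            if (usePrivateLock) demo.updateUsingPrivateLock(); else demo.updateUsingSharedMonitor();
            done.countDown();
        }, "SurfaceDraw");
        boolean finished = false;
        try {
            synchronized (sharedHolder) {
                worker.start();
                finished = done.await(500, TimeUnit.MILLISECONDS);
            }
            worker.join(); // the monitor is released by now, so the worker can always finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println("shared monitor: " + updateFinishes(false));
        System.out.println("private lock:   " + updateFinishes(true));
    }
}
```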
Source: https://stackoverflow.com/questions/23225993/anr-internal-function-on-some-devices