1.ANR的定义
ANR(Application Not Responding):应用无响应
即主线程在特定的时间内没有完成特定的事情,就会产生ANR。
在Android当中有以下几种ANR的类型:
- KeyDispatchTimeout,input事件在5秒内没有处理完;
- ServiceTimeout,前台service在20秒内,后台service在200秒内没有处理完;
- BroadcastTimeout,BroadcastReceiver的onReceiver,前台广播在10秒内,后台广播在60秒内没有处理完;
- ProcessContentproviderPublishTimeoutLocked,ContentProvider publish在10秒内没有处理完;
2.各种场景产生ANR的原因
我们分别针对这几种场景来看看,系统是如何抛出ANR异常的
2.1 ServiceTimeout
首先我们来看下一个Service启动的流程,如下图所示
我们来看下ActiveServices.realStartServiceLocked的源码,
private final void realStartServiceLocked(ServiceRecord r,
ProcessRecord app, boolean execInFg) throws RemoteException {
if (app.thread == null) {
throw new RemoteException();
}
if (DEBUG_MU)
Slog.v(TAG_MU, "realStartServiceLocked, ServiceRecord.uid = " + r.appInfo.uid
+ ", ProcessRecord.uid = " + app.uid);
r.app = app;
r.restartTime = r.lastActivity = SystemClock.uptimeMillis();
final boolean newService = app.services.add(r);
//1.启动ANR监测
bumpServiceExecutingLocked(r, execInFg, "create");
mAm.updateLruProcessLocked(app, false, null);
updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
mAm.updateOomAdjLocked();
boolean created = false;
try {
if (LOG_SERVICE_START_STOP) {
String nameTerm;
int lastPeriod = r.shortName.lastIndexOf('.');
nameTerm = lastPeriod >= 0 ? r.shortName.substring(lastPeriod) : r.shortName;
EventLogTags.writeAmCreateService(
r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
}
synchronized (r.stats.getBatteryStats()) {
r.stats.startLaunchedLocked();
}
mAm.notifyPackageUse(r.serviceInfo.packageName,
PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
//2.启动service
app.thread.scheduleCreateService(r, r.serviceInfo,
mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
app.repProcState);
r.postNotification();
created = true;
} catch (DeadObjectException e) {
Slog.w(TAG, "Application dead when creating service " + r);
mAm.appDiedLocked(app);
throw e;
} finally {
if (!created) {
// Keep the executeNesting count accurate.
final boolean inDestroying = mDestroyingServices.contains(r);
serviceDoneExecutingLocked(r, inDestroying, inDestroying);
// Cleanup.
if (newService) {
app.services.remove(r);
r.app = null;
}
// Retry.
if (!inDestroying) {
scheduleServiceRestartLocked(r, false);
}
}
}
......
}
在上面的代码注释2处,启动了service,而在启动service之前,执行了函数bumpServiceExecutingLocked,这个函数会调用scheduleServiceTimeoutLocked,启动ANR的监测机制,具体如下所示
// 前台service timeout的时间
static final int SERVICE_TIMEOUT = 20*1000;
// 后台service Timeout的时间
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
if (proc.executingServices.size() == 0 || proc.thread == null) {
return;
}
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
mAm.mHandler.sendMessageDelayed(msg,
proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}
可以很容易的看出,scheduleServiceTimeoutLocked其实就是往handler发送了一个延迟msg,如果是后台service,delay的时间就是200秒,如果是前台service,delay的时间就是20秒。
而当服务如果在限制的时间内就执行完成了,那么执行ActivityManagerService.serviceDoneExecuting移除这个msg。
public void serviceDoneExecuting(IBinder token, int type, int startId, int res) {
synchronized(this) {
if (!(token instanceof ServiceRecord)) {
Slog.e(TAG, "serviceDoneExecuting: Invalid service token=" + token);
throw new IllegalArgumentException("Invalid service token");
}
mServices.serviceDoneExecutingLocked((ServiceRecord)token, type, startId, res);
}
}
2.2 BroadcastTimeout
BroadcastTimeout的ANR原理和Service的差不多,我们来看下BroadcastQueue.processNextBroadcastLocked的源码,
final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
......
do {
if (mOrderedBroadcasts.size() == 0) {
// No more broadcasts pending, so all done!
mService.scheduleAppGcsLocked();
if (looped) {
// If we had finished the last ordered broadcast, then
// make sure all processes have correct oom and sched
// adjustments.
mService.updateOomAdjLocked();
}
return;
}
r = mOrderedBroadcasts.get(0);
boolean forceReceive = false;
// Ensure that even if something goes awry with the timeout
// detection, we catch "hung" broadcasts here, discard them,
// and continue to make progress.
//
// This is only done if the system is ready so that PRE_BOOT_COMPLETED
// receivers don't get executed with timeouts. They're intended for
// one time heavy lifting after system upgrades and can take
// significant amounts of time.
int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
if (mService.mProcessesReady && r.dispatchTime > 0) {
long now = SystemClock.uptimeMillis();
if ((numReceivers > 0) &&
(now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
Slog.w(TAG, "Hung broadcast ["
+ mQueueName + "] discarded after timeout failure:"
+ " now=" + now
+ " dispatchTime=" + r.dispatchTime
+ " startTime=" + r.receiverTime
+ " intent=" + r.intent
+ " numReceivers=" + numReceivers
+ " nextReceiver=" + r.nextReceiver
+ " state=" + r.state);
//1.启动ANR监听机制
broadcastTimeoutLocked(false); // forcibly finish this broadcast
forceReceive = true;
r.state = BroadcastRecord.IDLE;
}
}
if (r.state != BroadcastRecord.IDLE) {
if (DEBUG_BROADCAST) Slog.d(TAG_BROADCAST,
"processNextBroadcast("
+ mQueueName + ") called when not idle (state="
+ r.state + ")");
return;
}
if (r.receivers == null || r.nextReceiver >= numReceivers
|| r.resultAbort || forceReceive) {
// No more receivers for this broadcast! Send the final
// result if requested...
if (r.resultTo != null) {
try {
if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
"Finishing broadcast [" + mQueueName + "] "
+ r.intent.getAction() + " app=" + r.callerApp);
//2.处理广播消息
performReceiveLocked(r.callerApp, r.resultTo,
new Intent(r.intent), r.resultCode,
r.resultData, r.resultExtras, false, false, r.userId);
// Set this to null so that the reference
// (local and remote) isn't kept in the mBroadcastHistory.
r.resultTo = null;
} catch (RemoteException e) {
r.resultTo = null;
Slog.w(TAG, "Failure ["
+ mQueueName + "] sending broadcast result of "
+ r.intent, e);
}
}
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Cancelling BROADCAST_TIMEOUT_MSG");
cancelBroadcastTimeoutLocked();
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST,
"Finished with ordered broadcast " + r);
// ... and on to the next...
addBroadcastToHistoryLocked(r);
if (r.intent.getComponent() == null && r.intent.getPackage() == null
&& (r.intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
// This was an implicit broadcast... let's record it for posterity.
mService.addBroadcastStatLocked(r.intent.getAction(), r.callerPackage,
r.manifestCount, r.manifestSkipCount, r.finishTime-r.dispatchTime);
}
mOrderedBroadcasts.remove(0);
r = null;
looped = true;
continue;
}
} while (r == null);
......
}
我们可以看到在上面的代码注释1处启动了ANR的监听机制,在注释2处处理广播信息。在注释1处,broadcastTimeoutLocked最终会调用broadcastTimeoutLocked,如下面的代码所示,而timeout的时间就等于接收到广播的时间+mTimeoutPeriod。
final void broadcastTimeoutLocked(boolean fromMsg) {
......
if (fromMsg) {
if (!mService.mProcessesReady) {
// Only process broadcast timeouts if the system is ready. That way
// PRE_BOOT_COMPLETED broadcasts can't timeout as they are intended
// to do heavy lifting for system up.
return;
}
long timeoutTime = r.receiverTime + mTimeoutPeriod;
if (timeoutTime > now) {
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Premature timeout ["
+ mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
+ timeoutTime);
//发送延迟消息
setBroadcastTimeoutLocked(timeoutTime);
return;
}
}
......
}
final void setBroadcastTimeoutLocked(long timeoutTime) {
if (! mPendingBroadcastTimeoutMessage) {
Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
mHandler.sendMessageAtTime(msg, timeoutTime);
mPendingBroadcastTimeoutMessage = true;
}
}
mTimePeriod是在初始化的时候传入的,如下所示。可以很清楚的看到前台BroadcastReceiver的Timeout的时间是10秒,后台BroadcastReceiver的时间是60秒
//BroadcastQueue代码
BroadcastQueue(ActivityManagerService service, Handler handler,
String name, long timeoutPeriod, boolean allowDelayBehindServices) {
mService = service;
mHandler = new BroadcastHandler(handler.getLooper());
mQueueName = name;
mTimeoutPeriod = timeoutPeriod;
mDelayBehindServices = allowDelayBehindServices;
}
//ActivityManagerService代码
static final int BROADCAST_FG_TIMEOUT = 10*1000;
static final int BROADCAST_BG_TIMEOUT = 60*1000;
mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
"foreground", BROADCAST_FG_TIMEOUT, false);
mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
"background", BROADCAST_BG_TIMEOUT, true);
2.3 ProcessContentproviderPublishTimeoutLocked
在APP进程启动的时,ActivityThread的main函数会执行attach,而attach函数会调用ActivityManagerService.attachApplication,进而执行attachApplicationLocked函数,我们可以看到在注释1处发送了延迟msg,启动了anr监听机制,而CONTENT_PROVIDER_PUBLISH_TIMEOUT的时间是10秒。
static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
private final boolean attachApplicationLocked(IApplicationThread thread,
int pid, int callingUid, long startSeq) {
......
mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
boolean normalMode = mProcessesReady || isAllowedWhileBooting(app.info);
List<ProviderInfo> providers = normalMode ? generateApplicationProvidersLocked(app) : null;
if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
msg.obj = app;
//1.启动anr监听机制
mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
}
checkTime(startTime, "attachApplicationLocked: before bindApplication");
......
}
而当ContentProvider成功发布之后,ActivityManagerService会remove掉这个msg,如下面代码注释1处所示
public final void publishContentProviders(IApplicationThread caller,
List<ContentProviderHolder> providers) {
if (providers == null) {
return;
}
enforceNotIsolatedCaller("publishContentProviders");
synchronized (this) {
......
for (int i = 0; i < N; i++) {
ContentProviderHolder src = providers.get(i);
if (src == null || src.info == null || src.provider == null) {
continue;
}
ContentProviderRecord dst = r.pubProviders.get(src.info.name);
if (DEBUG_MU) Slog.v(TAG_MU, "ContentProviderRecord uid = " + dst.uid);
if (dst != null) {
ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
mProviderMap.putProviderByClass(comp, dst);
String names[] = dst.info.authority.split(";");
for (int j = 0; j < names.length; j++) {
mProviderMap.putProviderByName(names[j], dst);
}
int launchingCount = mLaunchingProviders.size();
int j;
boolean wasInLaunchingProviders = false;
for (j = 0; j < launchingCount; j++) {
if (mLaunchingProviders.get(j) == dst) {
mLaunchingProviders.remove(j);
wasInLaunchingProviders = true;
j--;
launchingCount--;
}
}
if (wasInLaunchingProviders) {
//1.移除Timeout的msg
mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
}
synchronized (dst) {
dst.provider = src.provider;
dst.proc = r;
dst.notifyAll();
}
updateOomAdjLocked(r, true);
maybeUpdateProviderUsageStatsLocked(r, src.info.packageName,
src.info.authority);
}
}
Binder.restoreCallingIdentity(origId);
}
}
2.4 KeyDispatchTimeout
InpuntManagerService会起两个线程InputReader线程和InputDispatcherThread,InputDispatcherThread线程会由reader线程wake,起来后就threadloop不断循环读取input事件。
具体流程如下
图片来源于https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
这里我们主要看一下handleTargetsNotReadyLocked函数,
如下注释1处,当inputdispatcher收到一个事件之后,会将mInputTargetWaitTimeoutTime赋值为当前时间+timeout,然后当下一个input事件来临之时,用此时的当前时间和mInputTargetWaitTimeoutTime进行比对(注释2处),如果当前时间大于mInputTargetWaitTimeoutTime则触发ANR。
//默认5秒超时
constexpr nsecs_t DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5000 * 1000000LL; // 5 sec
int32_t InputDispatcher::handleTargetsNotReadyLocked(
nsecs_t currentTime, const EventEntry* entry,
const sp<InputApplicationHandle>& applicationHandle,
const sp<InputWindowHandle>& windowHandle, nsecs_t* nextWakeupTime, const char* reason) {
if (applicationHandle == nullptr && windowHandle == nullptr) {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for system to become ready for input. Reason: %s", reason);
#endif
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
mInputTargetWaitStartTime = currentTime;
mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationToken.clear();
}
} else {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for application to become ready for input: %s. Reason: %s",
getApplicationWindowLabel(applicationHandle, windowHandle).c_str(), reason);
#endif
nsecs_t timeout;
if (windowHandle != nullptr) {
timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else if (applicationHandle != nullptr) {
timeout =
applicationHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else {
timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
}
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
mInputTargetWaitStartTime = currentTime;
//1.当前时间+timeout
mInputTargetWaitTimeoutTime = currentTime + timeout;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationToken.clear();
if (windowHandle != nullptr) {
mInputTargetWaitApplicationToken = windowHandle->getApplicationToken();
}
if (mInputTargetWaitApplicationToken == nullptr && applicationHandle != nullptr) {
mInputTargetWaitApplicationToken = applicationHandle->getApplicationToken();
}
}
}
if (mInputTargetWaitTimeoutExpired) {
return INPUT_EVENT_INJECTION_TIMED_OUT;
}
//2.触发anr
if (currentTime >= mInputTargetWaitTimeoutTime) {
onANRLocked(currentTime, applicationHandle, windowHandle, entry->eventTime,
mInputTargetWaitStartTime, reason);
// Force poll loop to wake up immediately on next iteration once we get the
// ANR response back from the policy.
*nextWakeupTime = LONG_LONG_MIN;
return INPUT_EVENT_INJECTION_PENDING;
} else {
// Force poll loop to wake up when timeout is due.
if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
*nextWakeupTime = mInputTargetWaitTimeoutTime;
}
return INPUT_EVENT_INJECTION_PENDING;
}
}
3.总结
对比这几种不同场景的anr,其实原理都是一样,就是在执行之前先埋一颗定时炸弹,如果能够在规定的时间之内执行完,那么就可以及时的拆除炸弹,否则就会引爆炸弹,产生anr。
参考:https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
来源:CSDN
作者:我是黄大仙
链接:https://blog.csdn.net/hbdatouerzi/article/details/103934294