Android performance optimization series: crash causes and capture
Basic causes of crashes
Analyzing crashes caused by thrown exceptions
Crashes are a very common sight in day-to-day development: a NullPointerException, an IllegalArgumentException, and so on. When the application throws an exception that nothing catches, the application crashes and its process is killed.
You may have always assumed that the process exits because the exception was thrown, that is, that the uncaught exception itself is the root cause of the process exit. But is this really the case?
The main thread is also a Thread, so we start by looking in the Thread source code for an explanation:
Thread.java
/**
 * Dispatch an uncaught exception to the handler. This method is
 * intended to be called only by the JVM.
 */
private void dispatchUncaughtException(Throwable e) {
    getUncaughtExceptionHandler().uncaughtException(this, e);
}
public UncaughtExceptionHandler getUncaughtExceptionHandler() {
    return uncaughtExceptionHandler != null ?
            uncaughtExceptionHandler : group;
}
public static UncaughtExceptionHandler getDefaultUncaughtExceptionHandler() {
    return defaultUncaughtExceptionHandler;
}
ThreadGroup.java
public void uncaughtException(Thread t, Throwable e) {
    if (parent != null) {
        parent.uncaughtException(t, e);
    } else {
        Thread.UncaughtExceptionHandler ueh =
                Thread.getDefaultUncaughtExceptionHandler();
        if (ueh != null) {
            ueh.uncaughtException(t, e);
        } else if (!(e instanceof ThreadDeath)) {
            System.err.print("Exception in thread \""
                    + t.getName() + "\" ");
            e.printStackTrace(System.err);
        }
    }
}
The code above is what Thread executes when handling an uncaught exception. dispatchUncaughtException() is called by the JVM when an exception is not caught.
When code fails and throws an exception, and no Thread.UncaughtExceptionHandler has been set, nothing in Thread or ThreadGroup makes the process exit: the fallback only prints the stack trace to System.err. So why does an uncaught exception in our app cause the process to exit?
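To make this concrete, here is a minimal sketch that runs on a plain desktop JVM rather than on Android. The class name UncaughtOnPlainJvm and its helper method are hypothetical, and the default handler it installs only records the exception so the result can be inspected. It illustrates that an uncaught exception ends the thread that threw it, while the process keeps running.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch (plain JVM, no Android runtime): an uncaught exception
// ends only the thread that threw it. All names here are illustrative.
class UncaughtOnPlainJvm {

    /** Runs a worker that throws, and returns what the default handler saw. */
    static Throwable runFailingWorker() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        // Process-wide default handler: it only records, it does not exit.
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> seen.set(error));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("boom");
            }, "worker");
            worker.start();
            worker.join(); // returns once the worker thread has died
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Restore whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable error = runFailingWorker();
        // We still get here: nothing in Thread or ThreadGroup ended the process.
        System.out.println("worker failed with: " + error + ", process still alive");
    }
}
```

The worker dies, the handler sees the exception, and main() continues normally. Whatever makes an Android app exit must therefore come from somewhere other than Thread itself.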
App processes in Android are forked from the zygote process, which is itself started by the init process. During startup, the runtime entry point shown below performs some common initialization in commonInit():
RuntimeInit.java
public static void main(String[] argv) {
    ...
    commonInit();
    ...
}
protected static final void commonInit() {
    ...
    LoggingHandler loggingHandler = new LoggingHandler();
    Thread.setUncaughtExceptionPreHandler(loggingHandler);
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
    ...
}
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            ...
            ActivityManager.getService().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
        } catch (Throwable t2) {
            ...
        } finally {
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}
Here we find the code that makes the application exit. RuntimeInit installs KillApplicationHandler as the default uncaught exception handler for every thread in the process (not only the main thread), and that handler deliberately kills the process and exits once an uncaught exception reaches it.
So we can conclude that throwing an uncaught exception is not, in itself, the root cause of the process being killed. The process exits because Android installs, by default, a handler that actively kills it when an exception goes uncaught.
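The shape of KillApplicationHandler (report in try, terminate in finally) can be sketched on a plain JVM as follows. This is an illustration of the pattern only, not Android's implementation: ReportThenTerminateHandler is a hypothetical class, and the reporter and terminator are injected stand-ins for handleApplicationCrash() and for killProcess() plus System.exit(), so that the demo does not actually kill the JVM.

```java
import java.util.function.Consumer;

// Sketch of the "report, then always terminate" pattern.
// The reporter and terminator are hypothetical stand-ins so the pattern can
// be exercised without ending the running JVM.
class ReportThenTerminateHandler implements Thread.UncaughtExceptionHandler {
    private final Consumer<Throwable> reporter; // stand-in for handleApplicationCrash()
    private final Runnable terminator;          // stand-in for killProcess() + System.exit()

    ReportThenTerminateHandler(Consumer<Throwable> reporter, Runnable terminator) {
        this.reporter = reporter;
        this.terminator = terminator;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        try {
            reporter.accept(error);
        } catch (Throwable reportingFailure) {
            // Reporting is best effort; a failure here must not block termination.
        } finally {
            // Runs whether or not reporting succeeded.
            terminator.run();
        }
    }

    public static void main(String[] args) {
        ReportThenTerminateHandler handler = new ReportThenTerminateHandler(
                error -> { throw new RuntimeException("reporting failed"); },
                () -> System.out.println("terminator ran"));
        // Even though the reporter throws, the terminator is still invoked.
        handler.uncaughtException(Thread.currentThread(), new NullPointerException("demo"));
    }
}
```

Putting the termination in finally is the important design choice: the process ends even if reporting the crash fails.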
A brief summary of how an uncaught Java exception is handled:
- When a Thread hits an uncaught exception, the JVM (ART on Android) calls dispatchUncaughtException().
- dispatchUncaughtException() hands the exception to a Thread.UncaughtExceptionHandler.
- By default Android installs KillApplicationHandler, which implements Thread.UncaughtExceptionHandler and is what makes the process exit.
That is, when our code throws an exception that nothing catches, the exception propagates up the call stack until the JVM takes over. The JVM calls the thread's dispatchUncaughtException(), which passes the exception to the KillApplicationHandler that Android installed when the process was set up, and that handler makes the process exit.
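The lookup order implied by getUncaughtExceptionHandler() and ThreadGroup.uncaughtException() can be observed with a small plain-JVM sketch. The class HandlerLookupOrder and the thread names are hypothetical. A thread with its own handler uses that handler, while a thread without one falls through its ThreadGroup to the process-wide default handler.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: a per-thread handler wins; otherwise the ThreadGroup forwards the
// exception to the default handler. Names are illustrative only.
class HandlerLookupOrder {

    /** Returns which handler was invoked for each of two failing threads. */
    static List<String> run() {
        List<String> calls = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler(
                (t, e) -> calls.add("default:" + t.getName()));
        try {
            Thread withOwn = new Thread(() -> {
                throw new RuntimeException("a");
            }, "with-own");
            withOwn.setUncaughtExceptionHandler(
                    (t, e) -> calls.add("own:" + t.getName()));

            Thread withoutOwn = new Thread(() -> {
                throw new RuntimeException("b");
            }, "without-own");

            // Run the threads one after the other so the order is deterministic.
            withOwn.start();
            withOwn.join();
            withoutOwn.start();
            withoutOwn.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```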
How does AMS handle application exception information reporting?
As the analysis above showed, the KillApplicationHandler that RuntimeInit installs hands the crash information to AMS (ActivityManagerService) before it kills the process:
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            ...
            ActivityManager.getService().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
        } catch (Throwable t2) {
            ...
        } finally {
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}
Then what exactly does AMS do before the process exits? Let’s look at the source code:
ActivityManagerService.java
public void handleApplicationCrash(IBinder app,
        ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}
void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    ...
    addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
    ...
}
After looking up the application's process record, AMS ends up calling addErrorToDropBox() to hand the error information to DropBox.
Android DropBox is a mechanism, introduced in API level 8 (Android 2.2), for persistently storing system data. It is mainly used to record logs when serious problems occur in the kernel, system processes, user processes, and so on while Android is running. It can be thought of as a persistent, system-level logcat. The logs are stored under /data/system/dropbox/ (this directory cannot be accessed without root), and crash and ANR logs end up here.
The eventType parameter identifies the type of log being recorded. Three values appear in this article:
- Uncaught Java exception: crash
- ANR: anr
- Native crash: native_crash
How the system handles native crashes
The previous sections covered how the system captures a Java crash. How is a native crash captured?
ActivityManagerService.java
public void startObservingNativeCrashes() {
    final NativeCrashListener ncl = new NativeCrashListener(this);
    ncl.start();
}
AMS's startObservingNativeCrashes() is called while SystemServer is starting up:
SystemServer.java
private void startOtherServices() {
    ...
    mActivityManagerService.startObservingNativeCrashes();
    ...
}
startOtherServices() runs after the core services have been started and brings up the remaining services. Native crash monitoring begins at this point.
Next, let's look at how native crashes are monitored.
NativeCrashListener.java
static final String DEBUGGERD_SOCKET_PATH = "/data/system/ndebugsocket";
@Override
public void run() {
    ...
    FileDescriptor serverFd = Os.socket(AF_UNIX, SOCK_STREAM, 0);
    final UnixSocketAddress sockAddr = UnixSocketAddress.createFileSystem(
            DEBUGGERD_SOCKET_PATH);
    Os.bind(serverFd, sockAddr);
    Os.listen(serverFd, 1);
    Os.chmod(DEBUGGERD_SOCKET_PATH, 0777);
    while (true) {
        FileDescriptor peerFd = null;
        try {
            ...
            peerFd = Os.accept(serverFd, null /* peerAddress */);
            ...
            if (peerFd != null) {
                consumeNativeCrashData(peerFd);
            }
            ...
        }
    }
    ...
}
void consumeNativeCrashData(FileDescriptor fd) {
    ...
    final String reportString = new String(os.toByteArray(), "UTF-8");
    (new NativeCrashReport(pr, signal, reportString)).start();
}
class NativeCrashReport extends Thread {
    ProcessRecord mApp;
    int mSignal;
    String mCrashReport;
    NativeCrashReport(ProcessRecord app, int signal, String report) {
        super("NativeCrashReport");
        mApp = app;
        mSignal = signal;
        mCrashReport = report;
    }
    @Override
    public void run() {
        try {
            CrashInfo ci = new CrashInfo();
            ci.exceptionClassName = "Native crash";
            ci.exceptionMessage = Os.strsignal(mSignal);
            ci.throwFileName = "unknown";
            ci.throwClassName = "unknown";
            ci.throwMethodName = "unknown";
            ci.stackTrace = mCrashReport;
            if (DEBUG) Slog.v(TAG, "Calling handleApplicationCrash()");
            mAm.handleApplicationCrashInner("native_crash", mApp, mApp.processName, ci);
            if (DEBUG) Slog.v(TAG, "<-- handleApplicationCrash() returned");
        } catch (Exception e) {
            Slog.e(TAG, "Unable to report native crash", e);
        }
        ...
    }
}
From the source code we can see that native crash monitoring works by creating a UNIX domain socket, binding it to /data/system/ndebugsocket, and blocking in accept() until a peer connects. A connection means that a native crash has occurred, and the crash data can then be read as bytes from the peer's file descriptor. The data is wrapped in a CrashInfo and handed to AMS, which again calls addErrorToDropBox(), this time with the eventType native_crash.
A brief summary of how a native crash is handled:
- Create the socket and listen on it.
- Block in accept() waiting for a connection.
- When a peer connects, read and process the crash data from its file descriptor.
- Wrap the data and pass it to AMS, which calls addErrorToDropBox() to store it.
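The "read and process the crash data" step can be sketched in plain Java. The body of consumeNativeCrashData() is elided in the excerpt above, so the framing used here is an assumption made purely for illustration: a 4-byte big-endian pid, a 4-byte big-endian signal number, then the report as UTF-8 text. The class NativeCrashPayload and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of decoding a crash payload read from a socket.
// ASSUMED framing (for illustration only): [int pid][int signal][UTF-8 report].
class NativeCrashPayload {
    final int pid;
    final int signal;
    final String report;

    NativeCrashPayload(int pid, int signal, String report) {
        this.pid = pid;
        this.signal = signal;
        this.report = report;
    }

    /** Decodes a payload that uses the assumed framing. */
    static NativeCrashPayload parse(byte[] bytes) {
        if (bytes.length < 8) {
            throw new IllegalArgumentException("payload shorter than the 8-byte header");
        }
        ByteBuffer header = ByteBuffer.wrap(bytes); // ByteBuffer is big-endian by default
        int pid = header.getInt();
        int signal = header.getInt();
        String report = new String(bytes, 8, bytes.length - 8, StandardCharsets.UTF_8);
        return new NativeCrashPayload(pid, signal, report);
    }

    /** Builds a payload with the same framing, used here only to feed the demo. */
    static byte[] encode(int pid, int signal, String report) {
        byte[] text = report.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + text.length)
                .putInt(pid)
                .putInt(signal)
                .put(text)
                .array();
    }

    public static void main(String[] args) {
        // The pid, signal number, and report text are made-up demo values.
        byte[] wire = encode(1234, 11, "backtrace: #00 pc 0000 libdemo.so");
        NativeCrashPayload payload = parse(wire);
        System.out.println(payload.pid + " " + payload.signal + " " + payload.report);
    }
}
```

Once decoded, the report text plays the role of the stackTrace field that NativeCrashReport fills in before calling handleApplicationCrashInner().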
How the system handles ANR exception data
This section only covers how an ANR is ultimately recorded. For why ANRs occur and how they are triggered, see the separate article "ANR triggering principle and analysis"; we will not go into the details here. We jump straight to the point where an ANR has already been detected:
AppErrors.java
final void appNotResponding(ProcessRecord app, ActivityRecord activity,
        ActivityRecord parent, boolean aboveSystem, final String annotation) {
    ...
    File tracesFile = ActivityManagerService.dumpStackTraces(
            true, firstPids,
            (isSilentANR) ? null : processCpuTracker,
            (isSilentANR) ? null : lastPids,
            nativePids);
    ...
    mService.addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
            cpuInfo, tracesFile, null);
    ...
}
ANR information is handled in AppErrors.appNotResponding(), which gathers the data and then calls AMS's addErrorToDropBox() with the eventType anr.
A brief summary of how an ANR is handled:
- Write the ANR data to Slog (the framework log).
- Write the ANR data to the main log.
- Dump the detailed data (the stack traces) to a file.
- Call AMS's addErrorToDropBox() to store the information.
addErrorToDropBox()
From the analysis above we know that Java crashes, native crashes, and ANRs all end up in AMS's addErrorToDropBox(). So what does it do?
ActivityManagerService.java
public void addErrorToDropBox(String eventType,
        ProcessRecord process, String processName, ActivityRecord activity,
        ActivityRecord parent, String subject,
        final String report, final File dataFile,
        final ApplicationErrorReport.CrashInfo crashInfo) {
    ...
    final StringBuilder sb = new StringBuilder(1024);
    appendDropBoxProcessHeaders(process, processName, sb);
    if (process != null) {
        sb.append("Foreground: ")
                .append(process.isInterestingToUserLocked() ? "Yes" : "No")
                .append("\n");
    }
    if (activity != null) {
        sb.append("Activity: ").append(activity.shortComponentName).append("\n");
    }
    if (parent != null && parent.app != null && parent.app.pid != process.pid) {
        sb.append("Parent-Process: ").append(parent.app.processName).append("\n");
    }
    if (parent != null && parent != activity) {
        sb.append("Parent-Activity: ").append(parent.shortComponentName).append("\n");
    }
    if (subject != null) {
        sb.append("Subject: ").append(subject).append("\n");
    }
    sb.append("Build: ").append(Build.FINGERPRINT).append("\n");
    if (Debug.isDebuggerConnected()) {
        sb.append("Debugger: Connected\n");
    }
    sb.append("\n");
    ...
    if (lines > 0) {
        sb.append("\n");
        // Merge several logcat streams, and take the last N lines
        InputStreamReader input = null;
        try {
            java.lang.Process logcat = new ProcessBuilder(
                    "/system/bin/timeout", "-k", "15s", "10s",
                    "/system/bin/logcat", "-v", "threadtime", "-b", "events", "-b", "system",
                    "-b", "main", "-b", "crash", "-t", String.valueOf(lines))
                    .redirectErrorStream(true).start();
            try { logcat.getOutputStream().close(); } catch (IOException e) {}
            try { logcat.getErrorStream().close(); } catch (IOException e) {}
            input = new InputStreamReader(logcat.getInputStream());
            int num;
            char[] buf = new char[8192];
            while ((num = input.read(buf)) > 0) sb.append(buf, 0, num);
        } catch (IOException e) {
            Slog.e(TAG, "Error running logcat", e);
        } finally {
            if (input != null) try { input.close(); } catch (IOException e) {}
        }
    }
    dbox.addText(dropboxTag, sb.toString());
    ...
}
DropBoxManager.java
public void addText(String tag, String data) {
    try {
        mService.add(new Entry(tag, 0, data));
    } catch (RemoteException e) {
        if (e instanceof TransactionTooLargeException
                && mContext.getApplicationInfo().targetSdkVersion < Build.VERSION_CODES.N) {
            Log.e(TAG, "App sent too much data, so it was ignored", e);
            return;
        }
        throw e.rethrowFromSystemServer();
    }
}
Whatever the error type, the final report is handed to DropBoxManager (concretely, DropBoxManagerService), which writes it to a file in the dropbox directory.
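The way addErrorToDropBox() assembles a report (a block of "Key: value" header lines, a blank line, then the body) can be imitated in plain Java. This is a simplified sketch, not the AOSP code: the class CrashReportText, the choice of header fields, and the sample values are all hypothetical, and a Java stack trace stands in for the body.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch of a header-plus-body text report in the style shown above.
// Field selection and values are illustrative assumptions.
class CrashReportText {

    static String build(String processName, boolean foreground,
                        String buildFingerprint, Throwable error) {
        StringBuilder sb = new StringBuilder(1024);
        // Header: one "Key: value" pair per line.
        sb.append("Process: ").append(processName).append("\n");
        sb.append("Foreground: ").append(foreground ? "Yes" : "No").append("\n");
        sb.append("Build: ").append(buildFingerprint).append("\n");
        // A blank line separates the header from the body.
        sb.append("\n");
        // Body: the stack trace rendered as text.
        StringWriter trace = new StringWriter();
        PrintWriter writer = new PrintWriter(trace);
        error.printStackTrace(writer);
        writer.flush();
        sb.append(trace);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values only; none of them come from a real device.
        String report = build("com.example.app", true, "demo/fingerprint",
                new IllegalStateException("boom"));
        System.out.println(report);
    }
}
```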
Summary
This article analyzed what Android does for us when a Java crash, a native crash, or an ANR occurs. In brief:
- A Java crash is dispatched by the JVM to the uncaught exception handler, reported to AMS, and finally saved as a file under /data/system/dropbox.
- A native crash is reported over a UNIX domain socket that the system listens on, and is finally saved as a file under /data/system/dropbox.
- An ANR can be triggered in several situations (input events, foreground and background services, and so on), and is finally saved as a file under /data/system/dropbox.
In every case, the Android system collects the corresponding data in the /data/system/dropbox directory.
We also worked through Android's crash handling mechanism:
When a Java exception is not caught, the JVM calls dispatchUncaughtException(), which invokes an UncaughtExceptionHandler to handle it. By default, RuntimeInit installs a KillApplicationHandler, which makes the process exit.
If we want to handle uncaught exceptions ourselves, we can install a custom UncaughtExceptionHandler to intercept them.
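A custom handler of this kind might look like the following minimal sketch. RecordingCrashHandler is a hypothetical class. It records the exception (a real app might write a crash file or queue an upload instead) and then delegates to whatever default handler was installed before it.

```java
// Sketch of a custom default handler that records the error and then
// delegates to the previously installed handler. Names are illustrative.
class RecordingCrashHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler next; // may be null on a plain JVM
    private volatile Throwable lastError;

    private RecordingCrashHandler(Thread.UncaughtExceptionHandler next) {
        this.next = next;
    }

    /** Installs the handler process-wide, remembering the previous default. */
    static RecordingCrashHandler install() {
        RecordingCrashHandler handler =
                new RecordingCrashHandler(Thread.getDefaultUncaughtExceptionHandler());
        Thread.setDefaultUncaughtExceptionHandler(handler);
        return handler;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        try {
            lastError = error; // placeholder for real crash capture
        } finally {
            // Always hand over to the previous handler, if there was one.
            if (next != null) {
                next.uncaughtException(thread, error);
            }
        }
    }

    Throwable lastError() {
        return lastError;
    }

    public static void main(String[] args) throws InterruptedException {
        RecordingCrashHandler handler = RecordingCrashHandler.install();
        Thread worker = new Thread(() -> {
            throw new IllegalArgumentException("demo");
        });
        worker.start();
        worker.join();
        System.out.println("recorded: " + handler.lastError());
    }
}
```

On Android the previously installed default handler would be the KillApplicationHandler described earlier, so delegating to it keeps the system's reporting and process-exit behavior. A handler that does not delegate takes on the responsibility of deciding what happens to the process.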