1. Java Backtrace
从Java Backtrace, 我们可以知道当时Process 的虚拟机执行状态. Java Backtrace 依靠SignalCatcher 来抓取.
Google default: SignalCatcher catchs SIGQUIT(3), and then print the java backtrace to /data/anr/trace.txt
MTK Enhance: SignalCatcher catchs SIGSTKFLT(16), and then print the java backtrace to /data/anr/mtktrace.txt( After 6577.SP/ICS2.MP)
You can update system properties dalvik.vm.stack-trace-file to Change the address, default is /data/anr/traces.txt
1.1 抓取的方式
* 在ENG Build 中
adb remount
adb shell chmod 0777 data/anr
adb shell kill -3 pid
adb pull /data/anr
* 在User Build 中
没有root 权限的情况下,只能直接pull 已经存在的backtrace.
adb pull /data/anr
* 你可以尝试直接使用下面的脚本一次性抓取
adb remount
adb shell chmod 0777 data/anr
adb shell ps
@echo off
set processid=
set /p processid=Please Input process id:
@echo on
adb shell kill -3 %processid%
@echo off
ping -n 8 127.0.0.1>nul
@echo on
adb pull data/anr/traces.txt trace-%processid%.txt
pause
1.2 JavaBacktrace 解析
下面是一小段system server 的java backtrace 的开始
----- pid 682 at 2014-07-30 18:04:53 -----
Cmd line: system_server
JNI: CheckJNI is off; workarounds are off; pins=4; globals=1484 (plus 50 weak)
DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x4193fde0 self=0x418538f8
| sysTid=682 nice=-2 sched=0/0 cgrp=apps handle=1074835940
| state=S schedstat=( 47858718206 26265263191 44902 ) utm=4074 stm=711 core=0
at android.os.MessageQueue.nativePollOnce(Native Method)
at android.os.MessageQueue.next(MessageQueue.java:138)
at android.os.Looper.loop(Looper.java:150)
at com.android.server.ServerThread.initAndLoop(SystemServer.java:1468)
at com.android.server.SystemServer.main(SystemServer.java:1563)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:515)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:829)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:645)
at dalvik.system.NativeStart.main(Native Method)
我们一行一行来解析.
* 0# 最开始是 -----PID at Time 然后接着 Cmd line: process name
* 1# the backtrace header: dvm thread :“DALVIK THREADS:”
* 2# Global DVM mutex value: if 0 unlock, else lock
tll: threadListLock,
tsl: threadSuspendLock,
tscl: threadSuspendCountLock
ghl: gcHeapLock
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
* 3# thread name, java thread Priority, [daemon], DVM thread id, DVM thread status.
"main" -> main thread -> activity thread
prio -> java thread priority default is 5, (nice =0, linux thread priority 120), domain is [1,10]
DVM thread id, NOT linux thread id
DVM thread Status:
ZOMBIE, RUNNABLE, TIMED_WAIT, MONITOR, WAIT, INITALIZING,STARTING, NATIVE, VMWAIT, SUSPENDED,UNKNOWN
"main" prio=5 tid=1 NATIVE
* 4# DVM thread status
group: default is “main”
Compiler,JDWP,Signal Catcher,GC,FinalizerWatchdogDaemon,FinalizerDaemon,ReferenceQueueDaemon are system group
sCount: thread suspend count
dsCount: thread dbg suspend count
obj: thread obj address
Sef: thread point address
group="main" sCount=1 dsCount=0 obj=0x4193fde0 self=0x418538f8
* #5 Linux thread status
sysTId: linux thread tid
Nice: linux thread nice value
sched: cgroup policy/gourp id
cgrp: c group
handle: handle address
sysTid=682 nice=-2 sched=0/0 cgrp=apps handle=1074835940
* #6 CPU Sched stat
Schedstat (Run CPU Clock/ns, Wait CPU Clock/ns, Slice times)
utm: utime, user space time(jiffies)
stm: stime, kernel space time(jiffies)
Core now running in cpu.
state=S schedstat=( 47858718206 26265263191 44902 ) utm=4074 stm=711 core=0
* #7-more Call Stack
1.3 几种常见的java backtrace
1.3.1 ActivityThread 正常状态/ActivityThread Normal Case
Message Queue is empty, and thread wait for next message.
"main" prio=5 tid=1 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x4193fde0 self=0x418538f8
| sysTid=11559 nice=0 sched=0/0 cgrp=apps/bg_non_interactive handle=1074835940
| state=S schedstat=( 2397315020 9177261498 7975 ) utm=100 stm=139 core=1
at android.os.MessageQueue.nativePollOnce(Native Method)
at android.os.MessageQueue.next(MessageQueue.java:138)
at android.os.Looper.loop(Looper.java:150)
at android.app.ActivityThread.main(ActivityThread.java:5299)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:515)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:829)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:645)
at dalvik.system.NativeStart.main(Native Method)
1.3.2 Java Backtrace Monitor case
Synchronized Lock: 等待同步锁时的backtrace.
"AnrMonitorThread" prio=5 tid=24 MONITOR
| group="main" sCount=1 dsCount=0 obj=0x41fd80c8 self=0x551ac808
| sysTid=711 nice=0 sched=0/0 cgrp=apps handle=1356369328
| state=S schedstat=( 8265377638 4744771625 6892 ) utm=160 stm=666 core=0
at com.android.server.am.ANRManager$AnrDumpMgr.dumpAnrDebugInfoLocked(SourceFile:~832)
- waiting to lock <0x42838968> (a com.android.server.am.ANRManager$AnrDumpRecord) held by tid=20 (ActivityManager)
at com.android.server.am.ANRManager$AnrDumpMgr.dumpAnrDebugInfo(SourceFile:824)
at com.android.server.am.ANRManager$AnrMonitorHandler.handleMessage(SourceFile:220)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:193)
at android.os.HandlerThread.run(HandlerThread.java:61)
1.3.3 执行JNI code 未返回,状态是native 的情况
"WifiP2pService" prio=5 tid=37 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x427a9910 self=0x55f088d8
| sysTid=734 nice=0 sched=0/0 cgrp=apps handle=1443230288
| state=S schedstat=( 91121772 135245305 170 ) utm=7 stm=2 core=1
#00 pc 00032700 /system/lib/libc.so (epoll_wait+12)
#01 pc 000105e3 /system/lib/libutils.so (android::Looper::pollInner(int)+94)
#02 pc 00010811 /system/lib/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+92)
#03 pc 0006c96d /system/lib/libandroid_runtime.so (android::NativeMessageQueue::pollOnce(_JNIEnv*, int)+22)
#04 pc 0001eacc /system/lib/libdvm.so (dvmPlatformInvoke+112)
#05 pc 0004fed9 /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+484)
#06 pc 00027ea8 /system/lib/libdvm.so
#07 pc 0002f4b0 /system/lib/libdvm.so (dvmMterpStd(Thread*)+76)
#08 pc 0002c994 /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+188)
#09 pc 000632a5 /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)+340)
#10 pc 000632c9 /system/lib/libdvm.so (dvmCallMethod(Thread*, Method const*, Object*, JValue*, ...)+20)
#11 pc 00057961 /system/lib/libdvm.so
#12 pc 0000dd40 /system/lib/libc.so (__thread_entry+72)
at android.os.MessageQueue.nativePollOnce(Native Method)
at android.os.MessageQueue.next(MessageQueue.java:138)
at android.os.Looper.loop(Looper.java:150)
at android.os.HandlerThread.run(HandlerThread.java:61)
1.3.4 执行object.wait 等待状态
"AsyncTask #1" prio=5 tid=33 WAIT
| group="main" sCount=1 dsCount=0 obj=0x427a8480 self=0x56036b40
| sysTid=733 nice=10 sched=0/0 cgrp=apps/bg_non_interactive handle=1443076000
| state=S schedstat=( 1941480839 10140523154 4229 ) utm=119 stm=75 core=0
at java.lang.Object.wait(Native Method)
- waiting on <0x427a8618> (a java.lang.VMThread) held by tid=33 (AsyncTask #1)
at java.lang.Thread.parkFor(Thread.java:1212)
at sun.misc.Unsafe.park(Unsafe.java:325)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:157)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2017)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:410)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1035)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1097)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:848)
1.3.5 Suspend 状态,通常表明是抓取backtrace 时,当时还正在执行java code, 被强制suspend 的情况
"FileObserver" prio=5 tid=23 SUSPENDED
| group="main" sCount=1 dsCount=0 obj=0x41fd1dc8 self=0x551abda0
| sysTid=710 nice=0 sched=0/0 cgrp=apps handle=1427817920
| state=S schedstat=( 130152222 399783851 383 ) utm=9 stm=4 core=0
#00 pc 000329f8 /system/lib/libc.so (__futex_syscall3+8)
#01 pc 000108cc /system/lib/libc.so (__pthread_cond_timedwait_relative+48)
#02 pc 0001092c /system/lib/libc.so (__pthread_cond_timedwait+64)
#03 pc 00055a93 /system/lib/libdvm.so
#04 pc 0005614d /system/lib/libdvm.so (dvmChangeStatus(Thread*, ThreadStatus)+34)
#05 pc 0004ae7f /system/lib/libdvm.so
#06 pc 0004e353 /system/lib/libdvm.so
#07 pc 000518d5 /system/lib/libandroid_runtime.so
#08 pc 0008af9f /system/lib/libandroid_runtime.so
#09 pc 0001eacc /system/lib/libdvm.so (dvmPlatformInvoke+112)
#10 pc 0004fed9 /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+484)
#11 pc 00027ea8 /system/lib/libdvm.so
#12 pc 0002f4b0 /system/lib/libdvm.so (dvmMterpStd(Thread*)+76)
#13 pc 0002c994 /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+188)
#14 pc 000632a5 /system/lib/libdvm.so (dvmCallMethodV(Thread*, Method const*, Object*, bool, JValue*, std::__va_list)+340)
#15 pc 000632c9 /system/lib/libdvm.so (dvmCallMethod(Thread*, Method const*, Object*, JValue*, ...)+20)
#16 pc 00057961 /system/lib/libdvm.so
#17 pc 0000dd40 /system/lib/libc.so (__thread_entry+72)
at android.os.FileObserver$ObserverThread.observe(Native Method)
at android.os.FileObserver$ObserverThread.run(FileObserver.java:88)
2. Native Backtrace
2.1 Native Backtrace 抓取方式
2.1.1 利用debuggerd 抓取
MTK 已经制作了一个利用debuggerd 抓取Native backtrace 的tool RTT(Runtime Trace), 对应的执行命令是:
rtt built timestamp (Apr 18 2014 15:36:21)
USAGE : rtt [-h] -f function -p pid [-t tid]
-f funcion : current support functions:
bt (Backtrace function)
-p pid : pid to trace
-t tid : tid to trace
-n name : process name to trace
-h : help menu
2.1.2 添加代码直接抓取
Google 默认提供了CallStack API, 请参考
system/core/include/libutils/CallStack.h
system/core/libutils/CallStack.cpp
可快速打印单个线程的backtrace.
2.2 解析Native Backtrace
你可以使用GDB, 或者addr2line 等 tool 来解析抓回的Native Backtrace, 从而知道当时正在执行的native 代码.
如addr2line 执行
arm-linux-androideabi-addr2line -f -C -e symbols address
3. Kernel Backtrace
3.1 Kernel Backtrace 抓取方式
3.1.1 运行时抓取
* AEE/RTT 工具
* Proc System
cat proc/pid/task/tid/stack
* Sysrq-trigger
adb shell cat proc/kmsg > kmsg.txt
adb shell "echo 8 > proc/sys/kernel/printk“ //修改printk loglevel
adb shell "echo t > /proc/sysrq-trigger“ //打印所有的backtrace
adb shell "echo w > /proc/sysrq-trigger“//打印'-D' status 'D' 的 process
* KDB
Long press volume UP and DOWN more then 10s
btp
Display stack for process
bta [DRSTCZEUIMA]
Display stack all processes
btc
Backtrace current process on each cpu
btt
Backtrace process given its struct task add
3.1.2 添加代码直接抓取
* #include
当前thread: dump_stack();
其他thread: show_stack(task, NULL);
3.2 Process/Thread 状态
"R (running)", /* 0 */
"S (sleeping)", /* 1 */
"D (disk sleep)", /* 2 */
"T (stopped)", /* 4 */
"t (tracing stop)", /* 8 */
"Z (zombie)", /* 16 */
"X (dead)", /* 32 */
"x (dead)", /* 64 */
"K (wakekill)", /* 128 */
"W (waking)", /* 256 */
通常一般的Process 处于的状态都是S (sleeping), 而如果一旦发现处于如D (disk sleep), T (stopped), Z (zombie) 等就要认真审查.
4. 几种典型的异常情况
4.1 Deadlock
下面这个case 可以看到PowerManagerService, ActivityManager, WindowManager 相互之间发生deadlock.
"PowerManagerService" prio=5 tid=25 MONITOR
| group="main" sCount=1 dsCount=0 obj=0x42bae270 self=0x6525d5c0
| sysTid=913 nice=0 sched=0/0 cgrp=apps handle=1696964440
| state=S schedstat=( 5088939411 10237027338 34016 ) utm=232 stm=276 core=2
at com.android.server.am.ActivityManagerService.bindService(ActivityManagerService.java:~14079)
- waiting to lock <0x42aa0f78> (a com.android.server.am.ActivityManagerService) held by tid=16 (ActivityManager)
at android.app.ContextImpl.bindServiceCommon(ContextImpl.java:1665)
at android.app.ContextImpl.bindService(ContextImpl.java:1648)
at com.android.server.power.PowerManagerService.bindSmartStandByService(PowerManagerService.java:4090)
at com.android.server.power.PowerManagerService.handleSmartStandBySettingChangedLocked(PowerManagerService.java:4144)
at com.android.server.power.PowerManagerService.access$5600(PowerManagerService.java:102)
at com.android.server.power.PowerManagerService$SmartStandBySettingObserver.onChange(PowerManagerService.java:4132)
at android.database.ContentObserver$NotificationRunnable.run(ContentObserver.java:181)
at android.os.Handler.handleCallback(Handler.java:809)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:139)
at android.os.HandlerThread.run(HandlerThread.java:58)
"ActivityManager" prio=5 tid=16 MONITOR
| group="main" sCount=1 dsCount=0 obj=0x42aa0d08 self=0x649166b0
| sysTid=902 nice=-2 sched=0/0 cgrp=apps handle=1687251744
| state=S schedstat=( 39360881460 25703061063 69675 ) utm=1544 stm=2392 core=2
at com.android.server.wm.WindowManagerService.setAppVisibility(WindowManagerService.java:~4783)
- waiting to lock <0x42d17590> (a java.util.HashMap) held by tid=12 (WindowManager)
at com.android.server.am.ActivityStack.stopActivityLocked(ActivityStack.java:2432)
at com.android.server.am.ActivityStackSupervisor.activityIdleInternalLocked(ActivityStackSupervisor.java:2103)
at com.android.server.am.ActivityStackSupervisor$ActivityStackSupervisorHandler.activityIdleInternal(ActivityStackSupervisor.java:2914)
at com.android.server.am.ActivityStackSupervisor$ActivityStackSupervisorHandler.handleMessage(ActivityStackSupervisor.java:2921)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:147)
at com.android.server.am.ActivityManagerService$AThread.run(ActivityManagerService.java:2112)
"WindowManager" prio=5 tid=12 MONITOR
| group="main" sCount=1 dsCount=0 obj=0x42a92550 self=0x6491dd48
| sysTid=898 nice=-4 sched=0/0 cgrp=apps handle=1687201104
| state=S schedstat=( 60734070955 41987172579 219755 ) utm=4659 stm=1414 core=1
at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManagerInternal(PowerManagerService.java:~3207)
- waiting to lock <0x42a95140> (a java.lang.Object) held by tid=25 (PowerManagerService)
at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManager(PowerManagerService.java:3196)
at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedInner(WindowManagerService.java:9686)
at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedLoop(WindowManagerService.java:8923)
at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLocked(WindowManagerService.java:8879)
at com.android.server.wm.WindowManagerService.access$500(WindowManagerService.java:170)
at com.android.server.wm.WindowManagerService$H.handleMessage(WindowManagerService.java:7795)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:147)
at android.os.HandlerThread.run(HandlerThread.java:58)
4.2 执行JNI native code 后一去不复返
"main" prio=5 tid=1 NATIVE
| group="main" sCount=1 dsCount=0 obj=0x41bb3d98 self=0x41ba2878
| sysTid=814 nice=-2 sched=0/0 cgrp=apps handle=1074389380
| state=D schedstat=( 22048890928 19526803112 32612 ) utm=1670 stm=534 core=0
(native backtrace unavailable)
at android.hardware.SystemSensorManager$BaseEventQueue.nativeDisableSensor(Native Method)
at android.hardware.SystemSensorManager$BaseEventQueue.disableSensor(SystemSensorManager.java:399)
at android.hardware.SystemSensorManager$BaseEventQueue.removeAllSensors(SystemSensorManager.java:325)
at android.hardware.SystemSensorManager.unregisterListenerImpl(SystemSensorManager.java:194)
at android.hardware.SensorManager.unregisterListener(SensorManager.java:560)
at com.android.internal.policy.impl.WindowOrientationListener.disable(WindowOrientationListener.java:139)
at com.android.internal.policy.impl.PhoneWindowManager.updateOrientationListenerLp(PhoneWindowManager.java:774)
at com.android.internal.policy.impl.PhoneWindowManager.screenTurnedOff(PhoneWindowManager.java:4897)
at com.android.server.power.Notifier.sendGoToSleepBroadcast(Notifier.java:518)
at com.android.server.power.Notifier.sendNextBroadcast(Notifier.java:434)
at com.android.server.power.Notifier.access$500(Notifier.java:63)
at com.android.server.power.Notifier$NotifierHandler.handleMessage(Notifier.java:584)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:193)
at com.android.server.ServerThread.initAndLoop(SystemServer.java:1436)
at com.android.server.SystemServer.main(SystemServer.java:1531)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:515)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:824)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:640)
at dalvik.system.NativeStart.main(Native Method)
===>
KERNEL SPACE BACKTRACE, sysTid=814
[] 0xffffffff from [] __schedule+0x3fc/0x950
[] __schedule+0xc/0x950 from [] schedule+0x40/0x80
[] schedule+0xc/0x80 from [] schedule_preempt_disabled+0x20/0x2c
[] schedule_preempt_disabled+0xc/0x2c from [] mutex_lock_nested+0x1b8/0x560
[] mutex_lock_nested+0xc/0x560 from [] gsensor_operate+0x1bc/0x2c0
[] gsensor_operate+0xc/0x2c0 from [] hwmsen_enable+0xa8/0x30c
[] hwmsen_enable+0xc/0x30c from [] hwmsen_unlocked_ioctl+0x2fc/0x528
[] hwmsen_unlocked_ioctl+0xc/0x528 from [] do_vfs_ioctl+0x94/0x5bc
[] do_vfs_ioctl+0xc/0x5bc from [] sys_ioctl+0x7c/0x8c
[] sys_ioctl+0xc/0x8c from [] ret_fast_syscall+0x0/0x40
[] from []