Tombstones (especially ones with lots of VMAs) are regularly truncated.
We can at least show the number of VMAs, though, for anyone interested
in knowing whether they got close to the default 64Ki limit.
Bug: http://b/66911122
Test: ran crasher, examined tombstone
Change-Id: I286db66f28f132307d573dbe5164efc969dc6ddc
Apply the same fix from c2e98f63 to intercept_manager.cpp.
Bug: http://b/64543673
Test: debuggerd_test
Change-Id: Ibfb919e059fa62f8336cfc1426d03ef015590136
If a function crashes by jumping into unexecutable code, the old method
could not unwind through that. Add a fallback method to set the pc from
the default return address location.
In addition, add a new finished check for steps. This will provide a method
to indicate that this step is the last step. This prevents cases where
the fallback method might be triggered incorrectly.
Update the libbacktrace code to unwind using the new methodology.
Update the unwind tool to use the new unwind methodology.
Add a new option to crasher that calls through a null function.
Create a new object, Unwinder, that encapsulates the a basic unwind. For now,
libbacktrace will still use the custom code.
Added new unit tests to cover the new cases. Also add a test that
crashes calling a nullptr as a function, and then has call frames in
the signal stack.
Bug: 65842173
Test: Pass all unit tests, verify crasher dumps properly.
Change-Id: Ia18430ab107e9f7bdf0e14a9b74710b1280bd7f4
Android.bp assumed only an armv7-a-neon core needs to set HAS_VFP_D32.
In fact, an armv8 core also has 32 double-word floating point registers
for A32 and T32 ISAs (AArch32 or 32-bit armv8).
Bug: 65568426
Test: lunch aosp_arm64; emulator # on oc-mr1-dev; boot to home screen.
Check crashglue.o actually uses VFP_D16-31 for 32-bit armv8 core.
Change-Id: I34584a27fa24a55bb4809ccd7f99a8122971df0e
The order of arguments is wrong - we're passing flags=static_cast<unsigned>(-1)
and backlog=LEV_OPT_CLOSE_ON_FREE (which is 2).
On versions of libevent prior to 2.1.8, this ends up accidentally setting
OPT_LEAVE_SOCKETS_BLOCKING, OPT_CLOSE_ON_EXEC, OPT_REUSABLE and OPT_THREADSAFE
and limiting our backlog to two. These unintentional changes are relatively
benign; we never make our sockets block, we never exec, we never reuse
sockets and the additional locking overhead should be negligible. The
backlog of two might be a problem in theory, but there haven't been any
reports of issues caused by it.
Things get worse on 2.1.8 - that version introduces several new flags,
one of which is OPT_DISABLED. This disables the new listener by default,
which means that our event loop returns early because it has no active listeners
for any of its events.
Bug: 64543673
Test: Manual.
Change-Id: I9954bc7fe1af761de1a950d935dd2e6ce7e2c5f5
Move libdebuggerd headers into their own directory for namespacing,
move some includes to the top of their implementing files, delete some
dead code.
Test: mma, treehugger
Change-Id: Ie4c44e32e2ab3bc678092899d257fd4ed634aa34
Instead of printing a useless "ptrace attach failed: strerror(EPERM)"
message, print the name and pid of a competing tracer when we fail to
attach because a process is already being ptraced.
Bug: http://b/31531918
Test: debuggerd_test32, debuggerd_test64 on aosp_angler
Test: strace -p `pidof surfaceflinger`; debuggerd -b surfaceflinger
Change-Id: Ifd3f80fe03de30ff38c0e0068560a7b12875f29d
In debuggerd, when dumping a tombstone, run the new unwinder and verify
the old and new unwinder are the same. If not, dump enough information
in the tombstones to figure out how to duplicate the failure.
Bug: 23762183
Test: Builds, ran and forced a mismatch and verified output.
Change-Id: Ia178bde64d67e623d4f35086ebda68aebbff0c3c
- Change the field name load_base to load_bias (which is what it really is).
- Add a rel_pc field so that callers do not need to compute it themselves.
- Remove the BacktraceMap::GetRelativePc() since nobody should need to
compute this themselves.
Bug: 23762183
Test: Compiles and unit tests pass (debuggerd, libbacktrace).
Change-Id: I2cb579767120adf08c407a58f3c487ee3f2b45fc
Print diagnostics when the user requests a dump that is guaranteed to
fail, such as trying to dump a process you can't send a signal to.
Bug: http://b/63008395
Change-Id: I5c6bf2a5751f858e0534990b8d2ab6932eb9f11d
Test: manually tested
Some processes have lots of threads and minidebug-info. Unwinding
these can take more than the original two seconds.
Bug: 62828735
Test: m
Test: debuggerd_test
Test: adb shell kill -s 6 `pid system_server`
Change-Id: I0041bd01753135ef9d86783a3c6a5cbca1c5bbad
The debuggerd_client.race tests the crash_dump process to finalize the
killed process within 2 seconds. The 2 seconds timeout for finalizing a
process, which has 1024 threads, is bit small for low-speed devices.
This CL lowers the bar in order to make such devices pass the test.
Wraping up 128 threads within 2 seconds looks safe.
Bug: 62600479
Test: debuggerd_test passes on low-speed devices.
Change-Id: I3089415961422e6933405d2c872913273425caff
For java traces, log the kind of dump as well as the PID of the
completed dump. This makes it easier to correlate dump requests with the
actual file they're written to.
Sample log statement:
E /system/bin/tombstoned: Traces for pid 4737 written to: /data/anr/trace_00
The message for native traces / tombstones remains unchanged because
several tools parse it.
Test: manual
Bug: 32064548
Change-Id: I7b3792dd5ae312ee0bc055c22ec3f7c747152072
The only case where tombstoned creates files for java traces is
when the process is signalled "by hand" using "shell kill -3", or
by the program itself. Such traces do not correspond to an ANR, so
name those files "trace_XX".
When dumpstate / system_server want to dump java traces, they set up
a tombstoned intercept and manage the lifetime of any associated file
that themselves.
Bug: 32064548
Test: manual, debuggerd_test
Change-Id: I97006ec7c0cd35de4b9564f535e77af846cc3891
Example:
signal 5 (SIGTRAP), code -32763 (PTRACE_EVENT_STOP), fault addr 0x274e00005fb3
I'm tempted to say that %d isn't the best choice for si_code, but as long as
we're fully decoding all the values, I don't think it matters.
Bug: http://b/62856172
Test: manual debuggerd run
Change-Id: Ieeca690828e1e12f4162bbadece53f4aa7b9537a
Kernel can use vsyscall for system calls. The vsyscall implementation in
the kernel gives one more depth in the backtrace. It leads to failures
on CrasherTest. This CL makes tests find a system call frame not only in
the first line but also in all lines on the backtrace.
Bug: 62600694
Test: passes all CrasherTests.
Change-Id: Ice383bb94db097e7e9a9e4f74d8fa5ecc528122a
Make it easy to find out where a specific crash's tombstone was written
to by adding a log.
Bug: http://b/62268830
Test: crasher
Change-Id: I1961dfb19f76a42a8448ebafd4be153b73cb6800
Don't pause the threads we're going to dump until after we're about to
fetch their backtraces.
Bug: http://b/62112103
Test: debuggerd_test
Change-Id: Id7ab0464842b35f98f3b3ebc42fb76161d8afbd2
Add some tracing to figure out where time is going during a dump.
Bug: http://b/62112103
Test: systrace.py sched freq idle bionic
Change-Id: Ic2a212beeb0bb0350b4d9c2cd7a4e70adc97752d
The SELinux changes that this depends on have now landed.
This change also adds a few lower level unit tests of intercept
functionality.
Test: make; debuggerd_test
Change-Id: I0be5e85e7097e26b71db269c9ed92d9b438bfb28
If a process tries to dump itself (e.g. system_server during ANRs),
crash_dump will block trying to write to its pipe if it's not
sufficiently large. Increase the pipe size to the max, and add a test
to make sure that it's always at least 1MB (the default value).
Bug: http://b/38427757
Test: debuggerd_test
Change-Id: Iddb0cb1e5ce9e687efa9e94c2748a1edfe09f119
crash_dump inherits its signal mask from the thread that forked it,
which always has all of its signals blocked, now that sigchain respects
sa_mask.
Manually clear the signal mask, and reduce the timeout to a
still-generous 2 seconds.
Bug: http://b/38427757
Test: manually inserted sleep in crash_dump
Change-Id: If1c9adb68777b71fb19d9b0f47d6998733ed8f52
All intercept requests and crash dump requests must now specify a
dump_type, which can be one of kDebuggerdNativeBacktrace,
kDebuggerdTombstone or kDebuggerdJavaBacktrace. Each process can have
only one outstanding intercept registered at a time.
There's only one non-trivial change in this changeset; and that is
to crash_dump. We now pass the type of dump via a command line
argument instead of inferring it from the (resent) signal, this allows
us to connect to tombstoned before we wait for the signal as the
protocol requires.
Test: debuggerd_test
Change-Id: I189b215acfecd08ac52ab29117e3465da00e3a37
.. for ART and the frameworks to link against. In the new stack dumping
scheme (see related bug), the Java runtime will communicate with
tombstoned in order to obtain a FD to which it can write its traces.
Also move things around to separate headers that are private
implementation details from headers that constitute the public debuggerd
API. There are currently only three such headers :
- tombstoned/tombstoned.h
- debuggerd/client.h
- debuggerd/handler.h
Bug: 32064548
Test: make
Change-Id: If1b8578550e373d84828b180bbe585f1088d1aa3