Change the VS AIDL to define ErrorCode as an enum. Add a new common
AIDL module to hold it, since it is needed by both existing AIDLs and we
don't want either depending directly on the other.
Modify vmclient to also have a Rust enum for ErrorCode, with translation
logic, in the same way as we did for DeathReason. Switch to Debug
rather than Display for these enums.
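A minimal sketch of the translation pattern described above, not the
actual vmclient code. AidlErrorCode is a hypothetical stand-in for the
AIDL-generated type, and the variant names and integer values are
illustrative assumptions. Unknown codes are preserved so they still show
up in Debug output.

```rust
/// Hypothetical stand-in for the AIDL-generated ErrorCode. The Rust AIDL
/// backend models enums as a newtype around an integer, so a value this
/// build does not know about can still arrive over binder.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AidlErrorCode(pub i32);

impl AidlErrorCode {
    // Illustrative values only; the real constants come from the AIDL file.
    pub const UNKNOWN: Self = Self(0);
    pub const PAYLOAD_VERIFICATION_FAILED: Self = Self(1);
    pub const PAYLOAD_CHANGED: Self = Self(2);
}

/// Rust-side enum that callers can match on exhaustively.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorCode {
    Unknown,
    PayloadVerificationFailed,
    PayloadChanged,
    /// A code this client does not recognise; the raw value is kept for logs.
    Unrecognised(AidlErrorCode),
}

impl From<AidlErrorCode> for ErrorCode {
    fn from(code: AidlErrorCode) -> Self {
        match code {
            AidlErrorCode::UNKNOWN => Self::Unknown,
            AidlErrorCode::PAYLOAD_VERIFICATION_FAILED => Self::PayloadVerificationFailed,
            AidlErrorCode::PAYLOAD_CHANGED => Self::PayloadChanged,
            // Anything else is passed through rather than dropped.
            _ => Self::Unrecognised(code),
        }
    }
}
```

Logging with `{:?}` then prints the variant name, or the raw value for
an unrecognised code.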
Also fixed a couple of AIDL warnings (enum-explicit-default).
We also need translation in the Java API, but I'll do that as part of
https://r.android.com/2192077.
Bug: 236811123
Test: atest MicrodroidTests MicrodroidHostTestCases
Test: composd_cmd test-compile
Change-Id: I561e5f43f0f1c74d1318dc41782ed390bb5f0337
Instead of having clients directly register a callback with VS,
implement a Rust-level callback interface in vmclient. This saves an
extra binder call on each notification, a bunch of boilerplate code,
and allows us to provide a slightly better interface (e.g. we can use
the Rust DeathReason enum, as elsewhere in vmclient, for instantly
better logging).
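A rough sketch of what such a Rust-level callback interface could look
like, under stated assumptions: the trait name VmCallback, the
VmEventDispatcher type, the method names and the raw reason values are
all hypothetical and are not taken from vmclient. The point is that
clients implement a plain Rust trait and receive an already translated
DeathReason, with no binder object of their own.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical subset of a Rust-side DeathReason enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DeathReason {
    Killed,
    Crash,
    Unknown,
}

/// Callback trait implemented by clients. Default bodies are empty so a
/// client only overrides the events it cares about.
pub trait VmCallback: Send + Sync {
    fn on_payload_started(&self, _cid: i32) {}
    fn on_died(&self, _cid: i32, _reason: DeathReason) {}
}

/// Stand-in for the library-side object that receives the raw
/// notification and forwards it to the client with a plain function call.
pub struct VmEventDispatcher {
    callback: Option<Box<dyn VmCallback>>,
}

impl VmEventDispatcher {
    pub fn new(callback: Option<Box<dyn VmCallback>>) -> Self {
        Self { callback }
    }

    /// Translate the raw reason once, then notify the client.
    pub fn notify_died(&self, cid: i32, raw_reason: i32) {
        // Illustrative raw values; the real mapping lives with the AIDL.
        let reason = match raw_reason {
            1 => DeathReason::Killed,
            2 => DeathReason::Crash,
            _ => DeathReason::Unknown,
        };
        if let Some(callback) = &self.callback {
            callback.on_died(cid, reason);
        }
    }
}

/// Example client: records a log line for each death notification.
pub struct RecordingCallback {
    pub events: Arc<Mutex<Vec<String>>>,
}

impl VmCallback for RecordingCallback {
    fn on_died(&self, cid: i32, reason: DeathReason) {
        self.events.lock().unwrap().push(format!("VM {} died: {:?}", cid, reason));
    }
}
```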
I also replaced all our usages of <some_interface>::binder::{...} with
direct access to binder::{...}. That makes it clearer what depends on
the interface itself and what is just generic binder code. I realise
this should be a separate change, but I only realised that after doing
bits of both.
Test: composd_cmd test-compile, observe logs (on both success & failure)
Test: atest -b (to make sure all our tests build)
Test: Presubmits
Change-Id: Iceda8d7b8f8008f9d7a2c51106c2794f09bb378e
Rather than killing the VM when we are done with it, ask the payload
to exit, and wait for the VM to have fully exited. This allows the VM
time to write logs, generate crash dumps, etc. if there has been a
failure.
Add a quit method to the CompOS service, so clients can request it to
exit. Add the ability to wait for a VM to have died with a timeout to
vmclient. Implement a wait for shutdown in compos_client that waits
for the VM to exit but terminates it abruptly if it doesn't do so in a
reasonable time; do the same thing if the VM fails to start.
Change compos_verify to use this method to wait for the VM to have
fully exited once we are done with it.
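The wait-with-timeout and the fallback to an abrupt kill can be
sketched as follows. This is an illustration only: VmState, the
function names and the closure-based quit/kill hooks are assumptions,
not the vmclient or compos_client API.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical subset of death reasons.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DeathReason {
    Shutdown,
    Killed,
}

/// Shared state updated by the death notification and read by waiters.
#[derive(Default)]
pub struct VmState {
    death: Mutex<Option<DeathReason>>,
    died: Condvar,
}

impl VmState {
    /// Record the death (first reason wins) and wake all waiters.
    pub fn notify_death(&self, reason: DeathReason) {
        let mut death = self.death.lock().unwrap();
        if death.is_none() {
            *death = Some(reason);
        }
        self.died.notify_all();
    }

    /// Block until the VM has died or the timeout expires.
    /// Returns None on timeout.
    pub fn wait_for_death_with_timeout(&self, timeout: Duration) -> Option<DeathReason> {
        let guard = self.death.lock().unwrap();
        let (guard, _timeout_result) = self
            .died
            .wait_timeout_while(guard, timeout, |death| death.is_none())
            .unwrap();
        *guard
    }
}

/// Ask the payload to quit, wait for a clean exit, and kill the VM only
/// if it has not exited within the timeout.
pub fn shutdown<Q, K>(state: &VmState, timeout: Duration, request_quit: Q, kill: K) -> Option<DeathReason>
where
    Q: FnOnce(),
    K: FnOnce(),
{
    request_quit();
    let result = state.wait_for_death_with_timeout(timeout);
    if result.is_none() {
        kill();
    }
    result
}
```

Using wait_timeout_while rather than a bare wait_timeout keeps the
sketch robust against spurious wakeups, and against a death that was
recorded before the waiter arrived.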
Assorted refactorings:
- Simplify the timeout handling code so we panic if the necessary
property isn't available (all requests would fail anyway). Also
updated the timeouts a little.
- Rename get_service to connect_service (it's definitely not a simple
getter).
I haven't dealt with compilation yet; that will have ramifications all
the way up to Java, and this CL is big enough already. Additionally I
haven't yet attempted to allow odsign to continue while we wait for
the VM to exit.
Bug: 236581575
Test: Run VM, see both finished & died in the logs.
Change-Id: I47501081d23833fe7ef791240161d93e38b3d951
When a kernel panic occurs in a pVM and the ramdump is enabled there, a
ramdump file is generated.
This file should eventually be consumed by the client (the owner of the
VM) for further analysis. VirtualizationService lets its client know that
a ramdump has been created and provides access to it.
Specifically, the end-to-end flow is as follows:
1) When starting a VM, an empty file is created under the VM-specific
directory (`/data/misc/virtualizationservice/<cid>/ramdump`).
2) The file becomes a backing store for a virtio-console device
(/dev/hvc3).
3) When a kernel panic occurs, the `crashdump` binary is executed and
`/proc/vmcore` is written to `/dev/hvc3`. After the dump is done,
the VM triggers a reboot, which kills crosvm.
4) Virtualizationservice is notified of the exit of crosvm. It then
checks the size of the ramdump file. If the file is not empty, it
assumes that a ramdump was generated in the pVM.
5) Virtualizationservice then notifies the client of the event via
`IVirtualMachineCallback.onRamdump(ParcelFileDescriptor)`, where the
parcel file descriptor is the handle to the ramdump file.
6) Client reads the ramdump file.
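The check in step 4 amounts to asking whether the pre-created file is
still empty once crosvm has exited. A minimal sketch, assuming a
hypothetical helper name open_ramdump_if_present (not the actual
virtualizationservice code):

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Return an open handle to the ramdump if the VM wrote one.
///
/// The file is created empty when the VM starts, so a non-zero length
/// after the VM exits is taken to mean that a ramdump was written.
pub fn open_ramdump_if_present(path: &Path) -> io::Result<Option<File>> {
    match fs::metadata(path) {
        // Non-empty: hand the file back so it can be passed to the client.
        Ok(meta) if meta.len() > 0 => File::open(path).map(Some),
        // Still empty: no panic occurred, or the dump never started.
        Ok(_) => Ok(None),
        // A missing file is treated the same as an empty one.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

The returned File would then be wrapped in whatever file descriptor
type the callback requires before being handed to the client.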
This change also adds a `--ramdump` option to the `vm` tool to designate
the path where the ramdump is saved.
Bug: 238278104
Test: Follow the steps below. Automated tests will be added in a
follow-up CL.
1) Run a pVM:
adb shell /apex/com.android.virt/bin/vm run-app --debug full --mem 300 \
--ramdump /data/local/tmp/virt/myramdump \
/data/local/tmp/virt/MicrodroidDemoApp.apk \
/data/local/tmp/virt/apk.idsig /data/local/tmp/virt/instance.img \
assets/vm_config.json
2) Adb shell into the VM
adb forward tcp:8000 vsock:10:5555
adb connect localhost:8000
adb -s localhost:8000 root
adb -s localhost:8000 shell
3) Load the crashdump kernel
/system/bin/kexec \
/system/etc/microdroid_crashdump_kernel \
/system/etc/microdroid_crashdump_initrd.img \
"1 rdinit=/bin/crashdump nr_cpus=1 reset_devices console=hvc0 earlycon=uart8250,mmio,0x3f8"
4) Trigger a crash
echo c > /proc/sysrq-trigger
5) Check the ramdump at /data/local/tmp/virt/myramdump
Change-Id: I1f90537961632708ca5a889cdd53390030518bb8
Hangup in Microdroid is defined as a state where the payload hasn't been
started for a long time. In that case AVF kills the VM and the death is
reported via the onDied callback.
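One way to picture the hangup detection is a watchdog that waits for a
"payload started" signal and kills the VM if none arrives in time. The
sketch below is an assumption about the general shape, not the AVF
implementation; spawn_boot_watchdog, BootOutcome and the channel-based
signalling are all hypothetical.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BootOutcome {
    /// The payload reported that it started within the timeout.
    PayloadStarted,
    /// No start signal arrived in time; the VM was killed.
    Hangup,
    /// The sender went away first (e.g. the VM died for another reason).
    Cancelled,
}

/// Start a watchdog thread. The caller sends `()` on the returned
/// channel when the payload starts; otherwise `kill_vm` runs on timeout.
pub fn spawn_boot_watchdog<K>(timeout: Duration, kill_vm: K) -> (Sender<()>, JoinHandle<BootOutcome>)
where
    K: FnOnce() + Send + 'static,
{
    let (started_tx, started_rx) = mpsc::channel();
    let handle = thread::spawn(move || match started_rx.recv_timeout(timeout) {
        Ok(()) => BootOutcome::PayloadStarted,
        Err(RecvTimeoutError::Timeout) => {
            // Treated as a hangup: kill the VM so the death gets reported.
            kill_vm();
            BootOutcome::Hangup
        }
        Err(RecvTimeoutError::Disconnected) => BootOutcome::Cancelled,
    });
    (started_tx, handle)
}
```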
In addition, modify the client-facing Java and Rust libraries to add
death reasons that were added before but haven't been surfaced yet.
Bug: 222228861
Test: I couldn't write a test for this because it is not possible to
intentionally cause a hang from a test. Instead, I confirmed that
`onDied` is called and the VM is eventually killed when I set the
timeout to a very small value (e.g. 100ms).
Change-Id: I53f232d0b609e6e8a429d996c7d6fdd0b37e7b4c
This reduces code duplication, and will also be useful for Rust tests.
Test: ComposHostTestCases compos_key_tests
Change-Id: I13c41d3b2bbe506495b723e7739f3181cb033f0f