Apollo Avatar Reconnect Snapshot
Originally ADR-0096-Apollo-Avatar-Reconnect-Snapshot (v4)
Context
The Uncrew Avatar has no local persistent storage. Telemetry and event logs reach it from the UAV as gRPC messages sent by the mavlink_shim, which represents the UAV (at least at this point). The Avatar then forwards this data to downstream clients that need up-to-date telemetry.
In the event of a connection loss between the mavlink_shim and the Avatar, we need a scheme that ensures telemetry data is preserved on the UAV and that the Avatar can fetch an immediate snapshot of all current telemetry data from the mavlink_shim when the connection is reestablished. Where certain telemetry types only update under certain circumstances or at longer intervals, the most up-to-date data must still be fetched immediately.
In the event of a connection loss between the Avatar and a downstream client, the client must be able to fetch a snapshot of the most up-to-date data from the Avatar. If the client needs historical data, persistent storage will be used to cover the period of the disconnect. For example, a client such as HubOps, Mission Service, or Flight Logs may need to fill in gaps for notification and reporting of mission progress, or for investigating issues.
This ADR addresses the process and logic for retaining UAV data when a connection is lost and for recovering it when that connection is regained. What counts as a disconnect or a successful connection (heartbeat, missed messages, timeout, etc.) is left as an implementation detail for each client of the Avatar. The mavlink_shim measures connectivity to the Avatar based on successful receipt of RPCs.
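As an illustration only, one way a component could derive connected/disconnected state from successful RPCs is sketched below. The `LinkMonitor` class, its method names, and the timeout value are hypothetical and not part of this ADR; the actual detection rule remains an implementation detail of each client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkMonitor:
    """Hypothetical helper: tracks link state from the last successful RPC."""

    timeout_s: float  # assumed threshold; the ADR does not fix a value
    last_success: Optional[float] = None
    connected: bool = False

    def record_success(self, now: float) -> bool:
        """Record a successful RPC. Returns True if this is a (re)connect."""
        reconnected = not self.connected
        self.last_success = now
        self.connected = True
        return reconnected

    def check(self, now: float) -> bool:
        """Mark the link lost if no RPC succeeded within timeout_s."""
        if (
            self.connected
            and self.last_success is not None
            and now - self.last_success > self.timeout_s
        ):
            self.connected = False
        return self.connected
```

A `True` result from `record_success` would be the natural point to start the configuration and snapshot exchange described in the Decision section.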
Decision
- When the UAV (represented by the mavlink_shim) and the Avatar connect, the Avatar will send the mavlink_shim a configuration for each telemetry type, detailing the following: minimum rate of updates, publishing policy (always, on value change, threshold, threshold with minimum rate, etc.), and whether to include the telemetry in a snapshot. The Avatar builds this configuration by aggregating the requirements of its clients before sending it to the shim. Upon receiving this configuration set, the mavlink_shim will send an initial snapshot of current telemetry values to the Avatar according to the accepted configuration. Snapshot telemetry values are sent via the same RPCs as regular telemetry. The Avatar will then hold this snapshot in RAM and update the snapshot values as new telemetry arrives.
- After a disconnect between the Avatar and the UAV, a snapshot will also be sent from the shim to the Avatar once the connection is restored, detailing the current state of the UAV, again as configured above. Both the Avatar and the mavlink_shim are aware when they become connected or disconnected. Since we use a backwards server-client relationship between them, the Avatar will send a command to get the snapshot from the time of disconnect (effectively a pull); alternatively, the mavlink_shim could simply push it.
- Historical telemetry data is currently not sent to the Avatar from the mavlink_shim on connect or reconnect. The only telemetry that will contain any kind of historical data is the mission progress, which will be formatted to show the status of all mission items in one RPC.
- The mavlink_shim will be responsible for storing logs regarding commands, telemetry, and events for the UAV.
- Upon a disconnect between the Avatar and a downstream client, the Avatar will note the disconnect. When this connection is reestablished, the client may request a snapshot from the Avatar or await polled telemetry. For telemetry types with large payloads, such as a UAV-generated mission plan, a unique identifier will be used to denote a version of the large data. This identifier will be stored on the Avatar when it is first received and passed along with snapshot requests. If the version the Avatar claims to have and the version the UAV wants to send in the snapshot are the same, the large data is not sent, which avoids unnecessary network usage.
- In the event of the Avatar crashing and restarting, it will request telemetry configurations from its clients, and then request a snapshot from the mavlink_shim, cache it, and broadcast it to downstream clients.
- In the event of the mavlink_shim crashing and restarting, upon reestablishing the connection, the Avatar will send the telemetry configuration and the mavlink_shim will respond with a snapshot for the Avatar to cache and broadcast.
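The in-RAM snapshot and the version check for large data described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the designed implementation: `TelemetryConfig`, `PublishPolicy`, `SnapshotCache`, and all field and telemetry names are hypothetical, and the real messages travel over gRPC rather than plain Python calls.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class PublishPolicy(Enum):
    # Policy names follow the list in the Decision section.
    ALWAYS = "always"
    ON_VALUE_CHANGE = "on_value_change"
    THRESHOLD = "threshold"
    THRESHOLD_WITH_MIN_RATE = "threshold_with_min_rate"


@dataclass(frozen=True)
class TelemetryConfig:
    """Hypothetical per-telemetry-type configuration."""

    min_rate_hz: float
    policy: PublishPolicy
    in_snapshot: bool


@dataclass
class SnapshotCache:
    """Hypothetical cache holding the latest value per telemetry type."""

    configs: Dict[str, TelemetryConfig]
    entries: Dict[str, Tuple[Any, Optional[str]]] = field(default_factory=dict)

    def update(self, name: str, value: Any, version: Optional[str] = None) -> None:
        """Store the newest value; skip types not configured for snapshots."""
        cfg = self.configs.get(name)
        if cfg is None or not cfg.in_snapshot:
            return
        self.entries[name] = (value, version)

    def snapshot(self, known_versions: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
        """Return current values, omitting versioned data the requester already holds."""
        known = known_versions or {}
        return {
            name: value
            for name, (value, version) in self.entries.items()
            if version is None or known.get(name) != version
        }
```

In this sketch a requester passes the version identifiers it already holds, and any large item whose identifier matches is left out of the reply. Unversioned telemetry is always included.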
Scenarios
Consequences
Full UAV logs and messages will be stored by the mavlink_shim. In the event of a disconnect (with the mavlink_shim, PubSub, or downstream clients), the Avatar will request a snapshot from the mavlink_shim and then forward it to the clients that require it.
Alternatives Considered
Currently, only mission progress is required as “historical data” after reconnecting. We are getting around this by changing the format of the mission progress message to be more verbose about the status of all mission items in the plan in the telemetry RPC itself. However, if other telemetry types require historical data in the future, then these would need to be designed in a new way. Perhaps this would become part of the telemetry configuration or a new command.
Future Work
- The idea of snapshots from the mavlink_shim still needs to be designed and implemented.
- Telemetry configuration which is sent to the mavlink_shim will need to be designed such that the Avatar can aggregate the needs of its clients and communicate those needs to the shim.
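Since the aggregation is still to be designed, the following is only a sketch of one possible merge rule, assuming each client states a minimum update rate and a snapshot flag per telemetry type. The `Requirement` type, the `aggregate` function, and the rule itself (highest requested rate wins; snapshot inclusion if any client asks for it) are assumptions. Merging publishing policies is deliberately left out because the ADR does not define how conflicting policies should be resolved.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Requirement:
    """Hypothetical per-client requirement for one telemetry type."""

    min_rate_hz: float
    in_snapshot: bool


def aggregate(
    client_requirements: Iterable[Dict[str, Requirement]],
) -> Dict[str, Requirement]:
    """Merge per-client requirements into one configuration for the shim."""
    merged: Dict[str, Requirement] = {}
    for requirements in client_requirements:
        for name, req in requirements.items():
            current = merged.get(name)
            if current is None:
                merged[name] = req
            else:
                # Assumed rule: satisfy the most demanding client.
                merged[name] = Requirement(
                    min_rate_hz=max(current.min_rate_hz, req.min_rate_hz),
                    in_snapshot=current.in_snapshot or req.in_snapshot,
                )
    return merged
```

For example, if one client asks for battery telemetry at 1 Hz without snapshot inclusion and another asks for 0.2 Hz with snapshot inclusion, this rule yields 1 Hz with snapshot inclusion.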