
Apollo Avatar

Andi Lamprecht · 5 min read · Accepted
ADR-0107 · Author: Sybil Melton · Date: 2025-02-07 · Products: uncrew
Originally ADR-0083-Apollo-Avatar (v4) · Source on Confluence ↗

Avatar

Context

Apollo (when uploading a mission), the RPIC (operator/control) and the Visual Observer (operator/control) all need to communicate with the UAV. If they do so directly and simultaneously, the UAV will have to:

  • Triage and select whose commands to obey;

  • Authenticate and authorize. Web technology makes this possible in principle (JWT tokens), but:

    • doing so spills Apollo’s business logic onto the UAV. Meanwhile, deploying new software to UAVs should have different controls than deploying it to the cloud. kubectl update deployment will not reach UAVs for a long time to come (after our discussion with Google, we learned that UAVs should not be considered nodes in a Kubernetes cluster). Instead, the UAV can be configured with a certificate to mutually authenticate with its Avatar.
    • Web technology makes authentication possible when communicating over HTTP, but we fear that TCP will not be a good carrier for telemetry and C2 on a high-packet-loss radio network.
  • UAVs on cellular networks will typically hide behind NAT, and it will be far easier for the UAV to establish one tunnel to its Avatar than to negotiate the correct NAT hole-punching techniques with its operators - or with a STUN server, because the operators are also behind NAT. (A minimal sketch of such a UAV-initiated, certificate-authenticated tunnel follows this list.)
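
To illustrate the last two points, the sketch below (Go) shows a UAV dialing out to its Avatar over a single, mutually-authenticated TLS tunnel: the outbound connection sidesteps NAT, and the provisioned client certificate stands in for any JWT/business logic on the UAV. The file names, the Avatar address and the use of plain TLS over TCP are assumptions made for illustration only; the actual carrier is deliberately left open.

```go
// Minimal sketch: the UAV dials out to its Avatar over one
// mutually-authenticated TLS tunnel. All names and the use of TCP
// are illustrative; the real transport choice is a followup ADR.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"os"
)

func main() {
	// UAV identity: a per-UAV key pair provisioned at configuration time.
	cert, err := tls.LoadX509KeyPair("uav.crt", "uav.key")
	if err != nil {
		log.Fatal(err)
	}

	// Trust anchor: only the CA that signs Avatar certificates is trusted.
	caPEM, err := os.ReadFile("avatar-ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	roots := x509.NewCertPool()
	roots.AppendCertsFromPEM(caPEM)

	cfg := &tls.Config{
		Certificates: []tls.Certificate{cert}, // presented to the Avatar
		RootCAs:      roots,                   // used to verify the Avatar
	}

	// One outbound tunnel from the UAV to its dedicated Avatar;
	// no NAT punching, no operator-facing auth logic on the UAV.
	conn, err := tls.Dial("tcp", "uav-42.avatars.example.com:8443", cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// Telemetry up, C2 down, all over this single connection.
}
```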

Decision

We postulate a proxy mediating between the operator and the UAV, which we will call the “Avatar” (aka digital twin). We do so to stress that each UAV has a dedicated representation on the backend and that all communication with the UAV must pass through the Avatar.

A “proxy” or “representation” can mean anything from a bit of state stored in RAM somewhere to a dedicated process. In order to disambiguate and put a bit more flesh into this ADR, we are inflating it with what could be a separate ADR altogether: we assert that the Avatar is a dedicated, per-UAV process. We think we can model each as a Kubernetes deployment, but there are some risks related to this. For instance, creating a deployment that is to be contactable by the external world also requires an ingress resource and Kong cooperating in this process, for possibly thousands of per-UAV deployments. This is exotic enough to state a concern and ask for PoCs. If not Kubernetes deployments, perhaps pods. If not pods, perhaps an intermediary or sub-orchestrator.
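
To make the per-UAV-deployment idea concrete, a sketch of what such a Deployment object could look like follows (Go, client-go types). The names, labels, image and namespace are placeholders, and the exposure problem (ingress resources, Kong) is deliberately omitted; that is exactly the part we want PoCs for.

```go
// Sketch: one dedicated Avatar Deployment per UAV. All identifiers
// below are placeholders, not decided names.
package avatar

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func buildAvatarDeployment(uavID string) *appsv1.Deployment {
	one := int32(1) // exactly one Avatar process per UAV
	labels := map[string]string{"app": "avatar", "uav": uavID}

	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "avatar-" + uavID,
			Namespace: "avatars",
			Labels:    labels,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &one,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "avatar",
						Image: "registry.example.com/uncrew/avatar:latest",
					}},
				},
			},
		},
	}
}
```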

[Figure: Avatar.drawio.png]

For each ongoing mission, there is a limited number of operators (let’s call them mission operators) ready to step in to assume command of the UAV and - at any time - exactly one operator in actual command. The Avatar knows this somehow and only admits C2 commands from that one operator. Meanwhile, it broadcasts telemetry to all the mission operators and to a pubsub for analytics and broader, non-safety-critical consumption. A sketch of this rule follows.
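
The admission and fan-out rule might reduce to something like the sketch below. The types, and the way the Avatar learns who is in command, are invented here purely for illustration.

```go
// Sketch of the Avatar's core rule: accept C2 only from the operator in
// command, fan telemetry out to all mission operators and to analytics.
package avatar

import "sync"

type Command struct {
	Operator string // who sent it
	Payload  []byte
}

type Telemetry []byte

type Avatar struct {
	mu        sync.RWMutex
	commander string                    // the one operator in actual command
	operators map[string]chan Telemetry // all mission operators
	analytics chan<- Telemetry          // pubsub for non-safety-critical consumption
}

// Admit lets a C2 command through only if it comes from the operator
// currently in command; everything else is rejected.
func (a *Avatar) Admit(cmd Command) bool {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return cmd.Operator == a.commander
}

// Broadcast fans telemetry out to every mission operator and to analytics.
func (a *Avatar) Broadcast(t Telemetry) {
	a.mu.RLock()
	defer a.mu.RUnlock()
	for _, ch := range a.operators {
		select {
		case ch <- t:
		default: // never block the UAV link on a slow consumer
		}
	}
	select {
	case a.analytics <- t:
	default:
	}
}
```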

The following are not in scope and subject to followup ADRs:

  • The Avatar’s lifecycle;
  • The nature of the connection it establishes with the operators;
  • The nature of the connection it establishes with its UAV;

Consequences

Local Packet Radio

The presence of the Avatar makes it difficult for a Visual Observer to connect to a fly-by UAV via a local radio; however:

  • We aim at near-always-connected operations, and if the Visual Observer cannot connect to the drone via the cellular network, we shouldn’t optimize for the local radio, but for taking the UAV away from blackout areas.
  • We can still open up the possibility for the VO (or anyone privileged) to circumvent the Avatar and introduce a blackout policy on the UAV itself - have it accept commands from the local radio only if disconnected from the public Internet (see the sketch after this list).
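
A minimal sketch of such a blackout policy on the UAV; the source names and the shape of the check are assumptions:

```go
// Sketch: the local radio becomes an admissible command source only
// while the Avatar link is down.
package uav

func acceptCommandFrom(source string, avatarLinkUp bool) bool {
	switch source {
	case "avatar":
		return avatarLinkUp // normal case: everything goes through the Avatar
	case "local-radio":
		return !avatarLinkUp // blackout fallback for the VO / privileged operators
	default:
		return false
	}
}
```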

Multi-Pathing

We know that some form of a multi-pathing solution like the Elsight Halo will be at play, so a proxy is necessary no matter what. If we can arrange for that multi-pathing to terminate at the Avatar, along with the secure tunnel, we can minimize the network hops and centralize all communication infrastructure with it.

Self-Deploying Clusters

Typically, deployments are created from outside the cluster, but this ADR implies that we will have to learn how to do it from the inside.
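
A hedged sketch of what “from the inside” could mean with client-go, reusing the hypothetical buildAvatarDeployment constructor from the Decision section; RBAC for the service account is omitted:

```go
// Sketch: a backend service running in-cluster spawns a per-UAV Avatar
// Deployment, e.g. when a mission is uploaded.
package avatar

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func spawnAvatar(ctx context.Context, uavID string) error {
	cfg, err := rest.InClusterConfig() // uses the pod's own service account
	if err != nil {
		return err
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	_, err = clientset.AppsV1().
		Deployments("avatars").
		Create(ctx, buildAvatarDeployment(uavID), metav1.CreateOptions{})
	return err
}
```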

Alternatives Considered

The sheer concept of an Avatar doesn’t prevent multiple Avatars from occupying a single pod within a regular, horizontally scalable Kubernetes deployment. Here’s why we think this is a bad idea:

  • Dedicated processes isolate flights and keep one flight’s process from affecting another’s.
  • Should someone get their hands on the SSD of a fallen drone, they could impersonate it. Dedicated processes facilitate isolating access: a UAV connecting to its Avatar isn’t sharing the same network with other UAVs.
  • Having a standard multi-avatar deployment contradicts the constraint of having exactly one entity communicating with the UAV. That entity can no longer be the Avatar, but some other proxy. Consider a scenario with 3 pods of the Avatar deployment: the operator comes in and, by a sheer accident of load-balancing, their request lands with pod 3; the VO comes in and their request lands with pod 1. This means all three pods must be connected to the UAV, and possibly more than one would have to process the same telemetry (per operator and per UAV).
  • Many one-avatar processes will consume more RAM in total than one multi-avatar process, but each individual one-avatar process will always consume less RAM than a multi-avatar process, so it will be easier for Kubernetes to balance its sheets and correctly (re-)allocate pods to nodes.
  • It will be relatively easy to pre-allocate all the RAM a one-avatar process will ever need; that will be far more difficult for multi-avatar processes. They would have to be n-avatar processes and force us into doing UAV->pod allocation and load-balancing ourselves. One is a special case of n. (A sketch of what such pre-allocation could look like in Kubernetes terms follows this list.)
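
For the pre-allocation argument, fixing the per-avatar budget could translate into requests equal to limits (Kubernetes “Guaranteed” QoS). The numbers below are placeholders, not measurements:

```go
// Sketch: a fixed, pre-allocated resource budget for a one-avatar pod.
// With requests == limits the pod gets Guaranteed QoS, which makes
// per-UAV (re-)allocation to nodes straightforward to reason about.
package avatar

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func avatarResources() corev1.ResourceRequirements {
	mem := resource.MustParse("128Mi") // placeholder budget
	cpu := resource.MustParse("250m")  // placeholder budget
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{corev1.ResourceMemory: mem, corev1.ResourceCPU: cpu},
		Limits:   corev1.ResourceList{corev1.ResourceMemory: mem, corev1.ResourceCPU: cpu},
	}
}
```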