Apollo Avatar Controller
Originally ADR-0087-Apollo-Avatar-Controller (v4)
Context
We postulated the existence of the Avatar - a per-UAV Kubernetes deployment that mediates between the UAV and everything else. We did not, however, decide how:
- it comes into being and ceases to be;
- it becomes connected to the UAV;
- its clients learn about its existence and whereabouts.
Decision
Inventory
There shall be the DroneUp Inventory - an externally accessible (e.g. inventory.droneup.cloud), per-environment (e.g. inventory.test.droneup.cloud) service where UAVs periodically check in to announce their presence and become discoverable. A UAV is attached to an environment by being provisioned, manually or at the factory, with the Inventory’s host address and an mTLS client certificate with which it can authenticate itself to the Inventory. All factory-provisioned UAVs could initially share the same client certificate, but on the first check-in the Inventory should issue each of them a personalized certificate. The Inventory is the only entity a UAV trusts out of the box.
The need to provision a new certificate is hopefully seen as well-founded, but this won’t happen often, and returning the certificate in response to the periodic check-in sounds like mixing unrelated concerns. It also seems likely that there will be things other than certificates that the Inventory will want to tell the UAVs. The Inventory is the only thing a UAV trusts, so it is the only thing that can tell it to trust something else (like its Avatar).
And since we don’t want to wait a couple of minutes for the UAV to work out that it needs to connect to its Avatar, let’s replace the periodic check-in with a standing subscription for generic commands. gRPC, which we’re already investing in, is a good candidate communication mechanism for this.
And since each command would benefit from a response communicated back to the Inventory, we may be able to establish a request-response RPC with the UAV acting as the server.
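The command exchange described above can be illustrated with a minimal in-memory sketch. It is not a gRPC implementation: the transport is replaced by a plain function call, and the command kinds (`install_certificate`, `connect_to_avatar`), the class names and the example address are all hypothetical, chosen only to show the UAV answering each Inventory command with a result.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Command:
    """A generic command sent by the Inventory (field names are assumptions)."""
    command_id: int
    kind: str      # hypothetical kinds: "install_certificate", "connect_to_avatar"
    payload: str


@dataclass(frozen=True)
class CommandResult:
    """The UAV's response to one command, reported back to the Inventory."""
    command_id: int
    ok: bool
    detail: str = ""


class UavCommandHandler:
    """UAV side of the exchange: it plays the server role and answers commands."""

    def __init__(self) -> None:
        self.certificate = "shared-factory-cert"   # placeholder for the shared cert
        self.avatar_address: Optional[str] = None

    def handle(self, cmd: Command) -> CommandResult:
        if cmd.kind == "install_certificate":
            self.certificate = cmd.payload
            return CommandResult(cmd.command_id, True)
        if cmd.kind == "connect_to_avatar":
            self.avatar_address = cmd.payload
            return CommandResult(cmd.command_id, True)
        # Unknown commands are rejected rather than silently ignored.
        return CommandResult(cmd.command_id, False, f"unknown command: {cmd.kind}")


def run_subscription(commands: Iterable[Command],
                     handler: UavCommandHandler) -> List[CommandResult]:
    """Stand-in for the standing subscription: one result per received command."""
    return [handler.handle(cmd) for cmd in commands]
```

In a real deployment the list of commands would be a long-lived stream opened by the UAV, so that the Inventory can push a command as soon as it has one instead of waiting for the next check-in.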
Inventory is not a Schedule
It may be tempting to see the Inventory as the registry of UAVs available to fly missions and thus as a roster. It is not: the Inventory service is meant only to advertise UAVs’ presence. Assigning UAVs to missions (like assigning them for maintenance, or like assigning Operators to missions) shall be part of operational scheduling, which is someone else’s job and out of scope for this ADR.
Avatar Controller
There shall be the Avatar Controller, a Kubernetes service directly responsible for managing the lifecycle of all Avatars. It decides when and where each is deployed (at some point we want to deploy Avatars on Hub-local hardware) and when to tear it down. It hands a reference (external and internal hostnames) to whoever asks (subject to a so-far-unspecified ACL policy).
AirBoss has to support UAVs lurking behind NAT, so UAVs necessarily have to initiate the connection to Apollo. This could be achieved by the UAV establishing a VPN tunnel to Apollo, finding itself on the same network as its Avatar and allowing the Avatar to connect back to it. But how would the Avatar know when to connect to the UAV? This is why we think the general case is: the Avatar listens for incoming connections from UAVs, and UAVs initiate the outgoing connections towards the Avatar. This is where the Inventory fills the gap, enabling connectivity between the UAV and its Avatar by telling the UAV where its Avatar is.
Avatars exist for the UAVs they represent and not for the missions that need flying. Therefore Avatars aren’t created when missions need them; they are created when UAVs check in to the Inventory, and they stay there for as long as they are needed. An Avatar can only be retired (deployment removed) if it’s not assigned to a mission and its UAV hasn’t checked in for an extended period. This implies that the Inventory notifies not only about appearing UAVs, but also about disappearing ones. How the Avatar Controller learns about mission assignments is out of scope for this ADR, but an ADR needs to be proposed to answer this question.
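The retirement rule above can be written down as a small predicate. This is a sketch under stated assumptions: the ADR does not say how long the UAV must be absent, so the 30-minute `RETIRE_AFTER` threshold is invented for illustration, as are the `AvatarRecord` fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: the ADR leaves the absence threshold unspecified.
RETIRE_AFTER = timedelta(minutes=30)


@dataclass
class AvatarRecord:
    """What the Avatar Controller might track per Avatar (hypothetical fields)."""
    uav_id: str
    last_seen: datetime               # last time the Inventory reported the UAV
    mission_id: Optional[str] = None  # None when not assigned to a mission


def can_retire(avatar: AvatarRecord, now: datetime,
               retire_after: timedelta = RETIRE_AFTER) -> bool:
    """An Avatar may be retired only if it has no mission AND its UAV is long gone."""
    if avatar.mission_id is not None:
        return False
    return now - avatar.last_seen > retire_after
```

Both conditions must hold: an Avatar assigned to a mission is kept even if its UAV has been silent for a long time, and an idle Avatar is kept while its UAV keeps checking in.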
When a Mission Service needs an Avatar (for a specific mission and UAV id) it asks the Avatar Controller. The Avatar Controller responds with the Avatar, if it has one. It won’t have one if no UAV with this id has checked in and stayed connected. We could arrange for the UAV id to determine the Avatar’s hostname, but the Avatar Controller will have no difficulty exposing a “Get Avatar” API, so we won’t have to arrange for meaningful hostnames. We still might.
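A “Get Avatar” lookup of this kind could look roughly like the sketch below. Everything here is illustrative: the class and method names are hypothetical, the hostnames use a made-up numbering scheme precisely to show that callers need not derive them from the UAV id, and the actual Kubernetes deployment step is omitted.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass(frozen=True)
class AvatarRef:
    """Reference handed to clients: internal and external hostnames."""
    uav_id: str
    internal_host: str
    external_host: str


class AvatarController:
    """In-memory stand-in for the Avatar Controller's registry."""

    def __init__(self) -> None:
        self._avatars: Dict[str, AvatarRef] = {}
        self._next_index = count(1)

    def on_uav_appeared(self, uav_id: str) -> AvatarRef:
        """Called when the Inventory reports a UAV; idempotent per UAV id."""
        if uav_id not in self._avatars:
            n = next(self._next_index)
            # Hypothetical, deliberately opaque hostnames.
            self._avatars[uav_id] = AvatarRef(
                uav_id=uav_id,
                internal_host=f"avatar-{n}.avatars.svc.cluster.local",
                external_host=f"avatar-{n}.example.invalid",
            )
        return self._avatars[uav_id]

    def get_avatar(self, uav_id: str) -> Optional[AvatarRef]:
        """Return the Avatar for this UAV, or None if the UAV never checked in."""
        return self._avatars.get(uav_id)
```

A Mission Service would call `get_avatar("uav-42")` and treat `None` as "this UAV is not currently present", without making any assumption about what the hostname looks like.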
Simulated UAVs
Like physical UAVs, simulated UAVs are not AirBoss’s product. They come into existence out-of-band. Anyone with sufficient privileges to interact with AirBoss’s k8s cluster can list them (kubectl get pods -n simfarm), create them and tear them down. Notably, this can be testers, tests, operators and developers. One day we might build a frontend to make this easier to use. We might even be able to use GCP credentials directly in this frontend and get away without having to create an intermediary Simulator Controller.
Consequences
I keep on leaving this section empty. Maybe it’s not needed.
Alternatives Considered
Fleet management and OTA are handled today by the Auterion Suite. Auterion uses Mender to implement OTA, which has the UAV check in periodically to see whether there is new firmware for it to download and install. This is strikingly similar to the exchange we’ve proposed between the UAV and the DroneUp Inventory, so we had to consider piggy-backing on this existing process. We decided it’s not a good idea, as it mixes concerns, puts an additional dependency on Auterion (which may not be in the picture forever), and the Mender check-ins are too infrequent.