Apollo Flightlog

Andi Lamprecht · 7 min read · Accepted
ADR-0117 · Author: Sybil Melton · Date: 2025-02-07 · Products: uncrew
Originally ADR-0089-Apollo-FlightLog (v3) · Source on Confluence ↗

Flight Log

Context

Flight Log, as the name suggests, is the complete log of all events that happen in the context of a flight. The Log has two parts to it:

  • On-board Flight Log - the complete log from the perspective of the UAV. It will contain every aspect of system operation (including sensor output) and as such will be too large to transmit while the UAV flies. It will have to be harvested on the ground when the UAV returns from the mission. The on-board log consists of:

    • the PX4 Flight Controller log in the ULOG format (*.ulg) capturing all events occurring on Skynode. Each log session (resulting in a single *.ulg file) commences on arming and stops on disarming. This is why it’s correct to call it a Flight Log (and not, say, a Vehicle Log).
    • any non-PX4 events related to MAVLINKShim, Perception, Precision Landing or any future sub-system executing on the UAV’s generic compute.
  • Avatar Flight Log - the log from the perspective of the Avatar capturing every Operator action, observed UAV connection parameters and everything the UAV chooses to share during flight (at least basic telemetry).

The Flight Log isn’t a file. It is a sequence of events that happen on the axis of time and can be consumed in real time or captured on some medium resembling a file and consumed later.
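The "sequence of events, not a file" idea can be illustrated with a minimal sketch. The `FlightEvent` type and all of its field names are hypothetical and not part of this ADR; the point is only that the same iterable of events can be handed to a live consumer or captured to a file-like medium and replayed later.

```python
import io
import json
from dataclasses import dataclass, asdict
from typing import Iterable, Iterator


@dataclass(frozen=True)
class FlightEvent:
    """One entry on the flight's time axis (all field names are hypothetical)."""
    flight_id: str
    timestamp_us: int   # microseconds since the epoch
    source: str         # e.g. "avatar" or "uav"
    kind: str           # e.g. "telemetry" or "operator_action"
    payload: dict


def capture(events: Iterable[FlightEvent], sink: io.TextIOBase) -> int:
    """Write events to a file-like sink, one JSON object per line."""
    count = 0
    for event in events:
        sink.write(json.dumps(asdict(event)) + "\n")
        count += 1
    return count


def replay(source: Iterable[str]) -> Iterator[FlightEvent]:
    """Read captured lines back as the same sequence of events."""
    for line in source:
        if line.strip():
            yield FlightEvent(**json.loads(line))
```

A real-time consumer would iterate over the events directly, while `capture` followed by `replay` yields the same sequence later from storage.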

The Flight Log has many good reasons to exist:

  • Compliance & Auditing - as a drone operator we (DroneUp) are obliged to retain all flight logs, virtually forever, for inspection and for any future incident or accident investigation.

  • Analytics - as a drone operator we should keep the flight logs for internal analytics so that we can answer questions such as these:

    • How much do our flights deviate from their plans? We ask this so that we can:

      • tighten the plans and fit more traffic in the air;
      • improve our ETA estimations;
      • understand how weather or terrain affect our operations.
    • How much time do our flights spend in C2-less conditions?

    • How many missions have been interrupted by Operator action?

  • Real-time Traffic Sharing - as a drone operator interested in the safety of our own flights we should be sharing our traffic with the nearby UTM system, and thus with all other traffic participants, so that they are aware of our operations in real time.

    • The special case for this is sharing this telemetry with our clients who may need to share it with theirs - with a fancy drone icon moving on their screens.
  • Operations and Alerting - as a drone operator we should want to observe all flights as they happen and continuously notice peaks and troughs in completed or interrupted missions, so that we have a good perspective on daily operational performance.
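As an illustration of the Analytics questions above, the time a flight spends in C2-less conditions could be derived from the captured events roughly as follows. This is a sketch under stated assumptions: the event kinds `"c2_lost"` and `"c2_restored"` and the `(timestamp, kind)` tuple shape are hypothetical, not something this ADR defines.

```python
from typing import Iterable, Tuple


def c2_less_seconds(events: Iterable[Tuple[float, str]], flight_end: float) -> float:
    """Sum the time between each "c2_lost" and the following "c2_restored".

    `events` are (timestamp_seconds, kind) pairs; both kind names are
    hypothetical. A loss still open at `flight_end` is counted up to that time.
    """
    total = 0.0
    lost_at = None
    for timestamp, kind in sorted(events):
        if kind == "c2_lost" and lost_at is None:
            lost_at = timestamp
        elif kind == "c2_restored" and lost_at is not None:
            total += timestamp - lost_at
            lost_at = None
    if lost_at is not None:
        total += flight_end - lost_at
    return total
```

The same pattern (scan the ordered events, pair up state changes) would apply to counting missions interrupted by Operator action.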

Decision

This leads to the following architecture:

[Figure: Flight Log architecture diagram (FlightLog.drawio.png)]

I.e.:

  • All Avatars publish every event they observe into a PubSub topic relating to the ongoing flight.
  • MissionsService subscribes to the topics in order to serve client calls about the progress of the missions executed by those flights.
  • An auto-scaling Traffic Publisher subscribes to all flights and forwards their telemetry to a UTM system, making DroneUp flights observable by external parties.
  • All topics are written to Google Cloud Storage in a format suitable for Data Engineering tools, i.e. JSON.
  • The On-board Flight Log is extracted from the UAV’s SD card and uploaded to the same Google Cloud Storage bucket, probably as part of the same post-flight procedure (automated or otherwise) that removes the UAV’s battery for charging.
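The fan-out described in the list above can be sketched with a toy in-memory stand-in for PubSub. Nothing here uses the real Google Cloud Pub/Sub client; the `InMemoryPubSub` class, the handler signatures and the event fields are all hypothetical and only illustrate how one published stream serves three independent consumers.

```python
import json
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[str, Event], None]


class InMemoryPubSub:
    """Toy stand-in for a real PubSub service: one topic per flight."""

    def __init__(self) -> None:
        self._all_flights: List[Handler] = []
        self._per_flight: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, handler: Handler, flight_id: str = "") -> None:
        """Subscribe to one flight's topic, or to all flights if flight_id is empty."""
        if flight_id:
            self._per_flight[flight_id].append(handler)
        else:
            self._all_flights.append(handler)

    def publish(self, flight_id: str, event: Event) -> None:
        """Deliver the event to the flight's subscribers and to all-flight subscribers."""
        for handler in self._per_flight[flight_id] + self._all_flights:
            handler(flight_id, event)


mission_progress: List[Event] = []   # what MissionsService would serve to clients
utm_feed: List[Event] = []           # what the Traffic Publisher would forward
archive: List[str] = []              # JSON lines bound for the storage bucket

bus = InMemoryPubSub()
bus.subscribe(lambda fid, e: mission_progress.append(e), flight_id="flight-1")
bus.subscribe(lambda fid, e: utm_feed.append(e) if e["kind"] == "telemetry" else None)
bus.subscribe(lambda fid, e: archive.append(json.dumps({"flight_id": fid, **e})))

# The Avatar publishes everything it observes.
bus.publish("flight-1", {"kind": "telemetry", "lat": 36.85, "lon": -76.29})
bus.publish("flight-1", {"kind": "operator_action", "action": "pause"})
bus.publish("flight-2", {"kind": "telemetry", "lat": 36.90, "lon": -76.20})
```

In this sketch the publisher knows nothing about its consumers, which is the property the decision relies on: adding a further consumer does not change the Avatar.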

Alternatives Considered

Flight Log vs Application Log

It may be tempting to consider the Flight Log part of the Application Log and give it the same treatment. An Application Log is part of the practice software engineers call Observability, which lets them diagnose the behaviour of the code they have written, investigate incidents and confirm their assumptions about application performance.

Can the Flight Log be treated the same as the Application Log?

DroneUp’s GKE logging is set up using GCP Cloud Logging, which:

Deploys a per-node logging agent that reads container logs, adds helpful metadata, and then sends the logs to the logs router, which sends the logs to Cloud Logging and any of the Logging sink destinations that you have configured.

The Flight Log could be implemented using GCP Cloud Logging by additionally routing every log message every Avatar emits to:

  • a PubSub topic so that the MissionsService and TrafficPublisher can subscribe to it;
  • a Cloud Storage bucket, which the GCP documentation describes as follows: "Provides storage of log data in Cloud Storage. Log entries are stored as JSON files."

The pros and cons of having the Avatar write to Google Cloud Logging (vs PubSub) are as follows:

| To Google Cloud Logging | To PubSub |
|---|---|
| 👍 Routing Application Logs to Google Cloud Storage is easily configured and demands no additional resources. | 👎 Forwarding PubSub topics to Google Cloud Storage is not difficult, but requires an additional GCP product: Dataflow (which runs Apache Beam pipelines), and that is a Data (not Software) Engineering competence. |
| 👎 Using Google Cloud Logging implies JSON on input and output. Uncrew is a Protobuf shop, and having the Avatar turn Protobufs into JSON so that MissionConsole can turn them back into Protobuf is wasteful. Working with JSON on Google Cloud Storage is less of a problem, but it is probably not the format we ultimately want to store (e.g. Parquet). | 👍 Controlling the format with Apache Beam is easy. |
| 👎 Before forwarding to the sinks, the Log Router buffers log entries for an amount of time that we will never control and that may not be optimised for real-time consumption. | 👍 Writing directly to PubSub avoids the Google Cloud Logging delays. |

Application Logs have a different audience than operational logs. If we compare Uncrew to amazon.com, Flight Logs are more like the buyer’s purchase record than the amazon.com Application Logs, i.e. they are useful for analytics and to the clients themselves, but not to amazon.com engineers diagnosing the code. By equating the Application and Flight Logs solely in the scope of the Avatar we minimize the risk of an audience mix-up, but the risk still exists. E.g.: Avatar booting up in response to a newly connected UAV is not a Flight Log entry.

To sum up: going with Google Cloud Logging is easier, quicker and cheaper, but seems less correct and more risky than having the Avatar write to PubSub directly.

MissionsService and Flight Log

It is a requirement for the MissionsService to share some of the mission’s telemetry with the client who requested the mission. If we decide for the UAV to emit telemetry at 10 Hz and the MissionsService contract is to forward telemetry to the client only every 10m, then we may be putting a significant processing burden on the MissionsService (with little horizontal scaling pressure so far).
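The thinning step mentioned above can be sketched as a small generator. This is only an illustration under assumptions: it treats the forwarding contract as time-based and takes the interval as a parameter (a distance-based contract would compare positions instead), and the `(timestamp, payload)` sample shape is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

Sample = Tuple[float, dict]  # (timestamp in seconds, telemetry payload)


def thin(samples: Iterable[Sample], min_interval_s: float) -> Iterator[Sample]:
    """Yield the first sample, then only samples that arrive at least
    `min_interval_s` after the last one yielded; drop everything in between."""
    last_sent = None
    for timestamp, payload in samples:
        if last_sent is None or timestamp - last_sent >= min_interval_s:
            last_sent = timestamp
            yield timestamp, payload


# Three seconds of 10 Hz telemetry, thinned to at most one sample per second.
telemetry = [(i / 10, {"seq": i}) for i in range(30)]
forwarded = list(thin(telemetry, min_interval_s=1.0))
```

The work per incoming sample is a single comparison, so the cost is driven by the incoming rate rather than the outgoing one, whichever component ends up doing the thinning.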

Introducing an alternative (gRPC) interface for the Avatar to expose to the MissionsService (or asking the MissionsService to use the same interface as the pilots use) shifts the burden onto the Avatar, which now needs to thin out and write telemetry to yet another sink (beyond pilots and Flight Log). Maybe that’s ok. Maybe the pilots wish to see less telemetry than that which needs to be written to the Vehicle Log. The Avatar is already scaled out to meet the burden of processing the incoming telemetry, so it likely has enough headroom to do this extra bit.

Should this risk materialize, i.e. should we start seeing more MissionsService pods pop up because they need to thin out the telemetry before forwarding it to their clients, we could shift this burden onto the Avatar fairly easily. Doing so would still be slightly more difficult than having the MissionsService consume the Flight Log, because asking the MissionsService to use the same interface as the Pilot would force us to reconcile the authentication (right now divided between external and internal MustRunNoAuth). We do have plans to do that, but no pressure to do it any time soon.
