Subscription Details

Andi Lamprecht · 3 min read · Accepted
ADR-0083 · Author: Sybil Melton · Date: 2025-02-07 · Products: shared
Originally ADR-0073-SUBSCRIPTION-DETAILS (v4) · Source on Confluence ↗

UTM Airspace Subscription

Context

We have a complex pipeline consisting of multiple services designed to handle UTM airspace subscription updates. The pipeline needs to be reliable, performant, and maintainable.

Decision Drivers

  • Data integrity
  • Performance
  • Scalability
  • Fault tolerance

Decision

Continue with the current design, while addressing the challenges and questions listed in each component.


Components

1. Atlas Pipeline

Responsibilities

  • Source of data
  • Runs at semi-random intervals
  • Sends creates, updates, and deletes to pub/sub

2. Charger

Responsibilities

  • Receives authenticated push updates from pub/sub

  • Creates, updates, and deletes data in the PostGIS airspace table

    • The airspace table is used for fast queries

Questions/Challenges

  • Pub/sub message format changes.

    • Example: v1 sent elevation as a single value; v2 introduced a map for elevation that includes unit and ref in addition to value
    • We’ve implemented message versioning to handle schema changes
    • All versions are currently handled; in the future we will support only the last 2-3 versions
    • Versioning allows for the decoupling of deployment of the relevant producer and consumer services
  • Some messages have had invalid or unparsable data, e.g., invalid or problematic geometries

    • Validation is done at the HTTP endpoint connected to pub/sub. Any invalid message returns an appropriate error response and is observable via Honeycomb

    • Data quality pipelines run periodically to check our datasets at each level in our medallion data architecture

      • Great Expectations is used to monitor and alert (via Slack)
    • Additional validations have been added as invalid messages are monitored and logged

    • A thread-unsafe parsing library was found to be a cause of the remaining invalid geometries

  • Reconciling all airspace data between pub/sub and PostGIS

  • Ensuring complete update propagation

    • How do we know we are getting all updates?

      • Additional observability of the pipelines is needed
      • Need for a proof-of-life mechanism, or notifications
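
The message versioning described above can be illustrated with a version-dispatched parser. This is a minimal Python sketch under stated assumptions, not the actual Charger code: the field names (`version`, `elevation`, `value`, `unit`, `ref`), the v1 defaults for unit and reference, and all function and class names are hypothetical. Only the distinction between a single value in v1 and a map in v2 comes from the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Elevation:
    value: float
    unit: str
    ref: str


class InvalidMessage(ValueError):
    """Raised for unsupported or malformed messages; an HTTP endpoint could
    translate this into an error response."""


def _parse_v1(message: Dict[str, Any]) -> Elevation:
    # v1: elevation is a single number. The unit and ref defaults are assumptions.
    return Elevation(value=float(message["elevation"]), unit="m", ref="MSL")


def _parse_v2(message: Dict[str, Any]) -> Elevation:
    # v2: elevation is a map carrying value, unit, and ref.
    elev = message["elevation"]
    return Elevation(value=float(elev["value"]), unit=str(elev["unit"]), ref=str(elev["ref"]))


# One parser per supported version; dropping an old version means removing its entry.
PARSERS: Dict[int, Callable[[Dict[str, Any]], Elevation]] = {1: _parse_v1, 2: _parse_v2}


def parse_elevation(message: Dict[str, Any]) -> Elevation:
    """Dispatch on the message version and normalize to one internal type."""
    version = message.get("version")
    parser = PARSERS.get(version)
    if parser is None:
        raise InvalidMessage(f"unsupported message version: {version!r}")
    try:
        return parser(message)
    except (KeyError, TypeError, ValueError) as exc:
        raise InvalidMessage(f"malformed v{version} message: {exc}") from exc
```

Because both versions normalize to the same `Elevation` type, the producer and consumer can be deployed independently as long as the consumer still has a parser registered for the version the producer emits.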

3. UTM Sharing (Redis)

Responsibilities

  • Inform all pods that an airspace has been updated

  • Currently implemented with Redis

    • lightweight
    • offers a variety of options (streams, pub/sub, list)
    • no configuration required

Challenges

  • Stream vs. pub/sub

  • Redis failure scenarios

    • Updates skipped
    • Clients need to be notified by closing the connection
    • Maybe okay? The client can supply an offset when it reconnects
  • Verifying completeness of received updates

    • Periodically query Atlas airspace for recent changes and compare the count with the number of updates received
    • Close client connections if a problem is identified
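
The completeness check proposed above could look roughly like the following. This is a hedged Python sketch: it assumes each update carries a change timestamp that can be compared with the change timestamps reported by Atlas, and the function names and the `close_client_connections` callback are hypothetical stand-ins for the real query and connection handling.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable


def count_in_window(change_times: Iterable[datetime], start: datetime, end: datetime) -> int:
    """Count timestamps that fall in the half-open window [start, end)."""
    return sum(1 for ts in change_times if start <= ts < end)


def verify_window(
    atlas_change_times: Iterable[datetime],
    received_change_times: Iterable[datetime],
    start: datetime,
    end: datetime,
    close_client_connections: Callable[[], None],
) -> bool:
    """Compare the number of changes Atlas reports with the number received.

    On a mismatch, close client connections so that clients reconnect with
    an offset and back-fill whatever was missed.
    """
    expected = count_in_window(atlas_change_times, start, end)
    received = count_in_window(received_change_times, start, end)
    if received == expected:
        return True
    close_client_connections()
    return False


# Example: Atlas reports three changes in the last ten minutes, only two arrived.
now = datetime(2025, 2, 7, 12, 0, tzinfo=timezone.utc)
window_start = now - timedelta(minutes=10)
atlas = [now - timedelta(minutes=m) for m in (1, 4, 9)]
received = [now - timedelta(minutes=m) for m in (1, 9)]
closed = []
ok = verify_window(atlas, received, window_start, now, lambda: closed.append(True))
```

A count comparison is cheap but can miss offsetting errors, such as one lost update and one duplicate in the same window. Comparing identifiers would be stricter, at a higher query cost.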

4. UTM API Subscription API

Responsibilities

  • gRPC API for internal customers (but also appropriate for public use)

  • Offers an offset parameter so a client can immediately request airspaces updated since that offset

    • Important for back-filling after a lost connection
  • Continuous update stream
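
The way an offset lets a client back-fill after a lost connection can be modeled in memory. The sketch below is not the gRPC API. It assumes the offset is a monotonically increasing sequence number, which the text does not specify, and `UpdateLog`, `Client`, and `AirspaceUpdate` are hypothetical names.

```python
from typing import Iterable, List, NamedTuple


class AirspaceUpdate(NamedTuple):
    offset: int        # assumed: monotonically increasing sequence number
    airspace_id: str


class UpdateLog:
    """In-memory stand-in for the server's ordered update history."""

    def __init__(self) -> None:
        self._updates: List[AirspaceUpdate] = []

    def append(self, airspace_id: str) -> AirspaceUpdate:
        update = AirspaceUpdate(offset=len(self._updates) + 1, airspace_id=airspace_id)
        self._updates.append(update)
        return update

    def since(self, offset: int) -> List[AirspaceUpdate]:
        """Return updates with an offset strictly greater than the given one."""
        return [u for u in self._updates if u.offset > offset]


class Client:
    """Tracks the last offset seen so a reconnect can resume from it."""

    def __init__(self) -> None:
        self.last_offset = 0
        self.seen: List[str] = []

    def consume(self, updates: Iterable[AirspaceUpdate]) -> None:
        for update in updates:
            self.seen.append(update.airspace_id)
            self.last_offset = update.offset


log = UpdateLog()
client = Client()

log.append("airspace-a")
log.append("airspace-b")
client.consume(log.since(client.last_offset))   # initial connection

# The stream breaks here; updates keep arriving on the server side.
log.append("airspace-c")
log.append("airspace-d")

# On reconnect, the client passes its last offset and receives only what it missed.
client.consume(log.since(client.last_offset))
```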

Questions/Challenges

  • Handling broken streams

    • Inform client by closing stream
    • Client can reconnect with offset to fill in missing data
  • Need for client ‘proof of life’

    • Include update count in heartbeat message?
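
If the heartbeat were to carry an update count, as the question above suggests, a client could use it both as proof of life and as a gap detector. The following Python sketch shows one possible shape. The `Heartbeat` message, its `updates_sent` field, and the timeout value are assumptions, not part of the current API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Heartbeat:
    updates_sent: int  # assumed field: total updates the server sent on this stream


class StreamMonitor:
    """Client-side bookkeeping for one subscription stream."""

    def __init__(self, heartbeat_timeout_s: float) -> None:
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.updates_received = 0
        self.last_heartbeat_at: Optional[float] = None

    def on_update(self) -> None:
        self.updates_received += 1

    def on_heartbeat(self, heartbeat: Heartbeat, now: float) -> bool:
        """Record the heartbeat; return True when the counts agree.

        A False result means updates were missed, and the client should
        reconnect with its last offset.
        """
        self.last_heartbeat_at = now
        return heartbeat.updates_sent == self.updates_received

    def is_alive(self, now: float) -> bool:
        """The stream counts as alive if a heartbeat arrived within the timeout."""
        if self.last_heartbeat_at is None:
            return False
        return now - self.last_heartbeat_at <= self.heartbeat_timeout_s
```

Passing `now` explicitly keeps the sketch deterministic. A real client would more likely read a monotonic clock.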

Questions for Review

  1. Are all challenges sufficiently addressed?
  2. Do we need a failover for Redis?
  3. Is there a need for more stringent proof-of-life mechanisms?
  4. How should we improve observability, particularly in the data pipelines?