
Server Side Telemetry Processing Strategy

Andi Lamprecht · 13 min read · Draft
ADR-0268 · Author: Oleksii Naboichenko · Date: 2026-02-26 · Products: uncrew
Originally ADR-0129-Server-Side-Telemetry-Processing-Strategy (v9) · Source on Confluence ↗

Server-Side Unit Conversion & Calculations

Decision Summary

We will move altitude graph telemetry preparation to server-side presentation endpoints and keep all mutation APIs strict SI.
This update adds mandatory plan-version segmentation, snapshot ordering, self-describing per-series unit metadata, and a normative downsampling strategy.

Context


The current frontend altitude graph path performs plan normalization, terrain alignment, drone history projection, and route merge locally.

Primary current paths:

  • src/queryProviders/useFlyGraphData/index.ts
  • src/queryProviders/useFlyGraphData/hooks/useFullGraphData.ts
  • src/queryProviders/useFlyGraphData/utils/getFullRouteData.ts

Unit conversion is currently frontend-owned via local preference:

  • src/config/units.ts
  • src/utils/altitudeConverter.ts
  • src/utils/distanceConverter.ts

Performance Baseline (Current State)


| Metric | Current Value | Target | Why? |
|---|---|---|---|
| Points per typical 20-min mission | ~3,200 | <960 (70%+ reduction) | TBD |
| Uncompressed FE payload (full graph) | ~450 KB | <180 KB | TBD |
| FE merge/transform time (p95) | ~120 ms | <10 ms (thin adapter only) | TBD |

Observed constraints:

  1. Performance pressure from 3000+ telemetry points and frequent updates.
  2. Replanning during flight introduces ambiguity when FE matches by waypoint index only.
  3. Cross-client consistency risk when formulas and rounding are repeated on each platform.
  4. Flight-safety requirement that write paths remain unambiguous and canonical SI.
  5. The current altitude graph has no interactive zoom/brush flow; it renders the full extent directly in Recharts.

Decision Drivers (Ranked)

  1. Replan safety (blocking) — Replanning during flight causes index-remap errors that can mislead operators. This is the primary driver; without plan-version segmentation, the graph is unreliable during the most critical phase of flight.
  2. Cross-client consistency (high) — With three client platforms (web, plus mobile/desktop as needed), duplicating conversion formulas leads to divergent rounding and unit bugs. Centralizing on the server eliminates this class of defect.
  3. Performance (high) — 3,200+ raw points per mission cause unnecessary FE computation and network overhead. Downsampling is required for acceptable render performance.
  4. SI write-path integrity (non-negotiable) — Mutation APIs must never accept ambiguous units. This is a safety invariant, not an optimization.
  5. Safe rollout (required) — Any change to the telemetry display path must be incrementally rollable with instant revert.

Local algorithm benchmarks indicate that Go outperforms TypeScript significantly in arithmetic and collection processing.

Specifically, for collection processing, Go is roughly 27x faster (0.33ms vs 9.00ms). This suggests that heavy data manipulation belongs on the server.


However, moving all unit awareness to the server introduces risks regarding data integrity on write operations and tight coupling of UI concerns to domain APIs.

Problems To Solve

  1. Drastically reduce point count while preserving graph shape and critical events.
  2. Represent dynamic plan changes without index remap errors.
  3. Standardize unit conversion and rounding rules across clients.
  4. Preserve strict SI semantics for writes/commands.
  5. Provide safe rollout and deterministic fallback.

Data Flow: Current vs. Proposed

CURRENT (client-heavy):
┌────────┐    raw telemetry     ┌──────────────────────────────────────┐    render
│  gRPC  │ ──────────────────►  │  FE: merge, align, convert, project  │ ──────────►  Recharts
│ Server │   (~3200 pts, SI)    │  (useFlyGraphData pipeline)          │   (~3200 pts)
└────────┘                      └──────────────────────────────────────┘

PROPOSED (server-heavy):
┌───────────────────────────────────────────┐   graph-ready    ┌───────────────┐    render
│  Server: segment, downsample, convert,    │ ──────────────►  │  FE: adapter  │ ──────────►  Recharts
│  annotate (presentation endpoint)         │  (<960 pts)      │  + render     │   (<960 pts)
└───────────────────────────────────────────┘                  └───────────────┘

Decision

1) Read Path Becomes Server-Heavy

Introduce presentation/graph telemetry endpoints that return graph-ready, downsampled, optionally unit-converted series.

The frontend will focus on:

  1. request orchestration
  2. rendering
  3. loading/error states
  4. minimal adapter mapping

2) Primary Optimization: Downsampling Is Mandatory

For altitude graph series, server-side downsampling is required before the response is returned.

Chosen base algorithm:

  1. Ramer-Douglas-Peucker (RDP) in screen-space approximation (pixel-aware), not raw meter-space.

Mandatory anchors that must always be kept:

  1. segment start/end
  2. plan_change_at points
  3. mission-state transition points (warning/contingency enter/exit)
  4. first and last actual telemetry point in each segment
  5. local extrema (peak/valley) candidates for actual and planned series

Pipeline:

  1. split by plan_version_id
  2. inject anchors
  3. run RDP between anchors only
  4. enforce max_points cap as final guard

Default parameters (v1):

  1. epsilon_y_px = 0.75 (vertical tolerance — sub-pixel on 2× DPR displays, ensuring no visible shape loss at standard retina resolution)
  2. epsilon_x_px = 0.5 (horizontal tolerance — tighter than vertical because horizontal misalignment is more perceptible in time-series graphs)
  3. max_points = min(1200, max(300, 2 * viewport_width_px)) (scales with screen width; 2× factor provides comfortable headroom above 1-point-per-pixel)

These values were derived from visual regression testing on representative 20-minute missions at 1920×1080 and 2560×1440 resolutions. They should be re-validated during Phase 2 parity testing and may be tuned per device_pixel_ratio.
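
To make the pipeline concrete, here is a minimal TypeScript sketch of the anchor-aware, pixel-space RDP pass. It is illustrative only: the production implementation belongs on the server (per the Go benchmarks above), all names are hypothetical, the two-axis epsilon is simplified to a single perpendicular tolerance, and the final max_points guard is omitted.

interface Pt { x: number; y: number }   // x = distance, y = altitude (SI)

// Project data coordinates into screen space so epsilon is pixel-meaningful.
function toScreen(pts: Pt[], widthPx: number, heightPx: number): Pt[] {
  const xs = pts.map(p => p.x), ys = pts.map(p => p.y);
  const x0 = Math.min(...xs), x1 = Math.max(...xs);
  const y0 = Math.min(...ys), y1 = Math.max(...ys);
  const sx = widthPx / (x1 - x0 || 1), sy = heightPx / (y1 - y0 || 1);
  return pts.map(p => ({ x: (p.x - x0) * sx, y: (p.y - y0) * sy }));
}

// Classic RDP over pts[lo..hi]; marks survivors in `keep`. Endpoints are
// assumed already kept by the caller (they are anchors).
function rdp(pts: Pt[], lo: number, hi: number, epsPx: number, keep: boolean[]): void {
  const a = pts[lo], b = pts[hi];
  const dx = b.x - a.x, dy = b.y - a.y;
  const len = Math.hypot(dx, dy) || 1;
  let maxD = 0, maxI = -1;
  for (let i = lo + 1; i < hi; i++) {
    // Perpendicular pixel distance from pts[i] to the chord a→b.
    const d = Math.abs(dy * (pts[i].x - a.x) - dx * (pts[i].y - a.y)) / len;
    if (d > maxD) { maxD = d; maxI = i; }
  }
  if (maxI !== -1 && maxD > epsPx) {
    keep[maxI] = true;
    rdp(pts, lo, maxI, epsPx, keep);
    rdp(pts, maxI, hi, epsPx, keep);
  }
}

// Pin anchors (segment bounds, plan_change_at, state transitions, extrema),
// then run RDP only between consecutive anchors.
function downsample(raw: Pt[], anchorIdx: number[], widthPx: number,
                    heightPx: number, epsPx: number): Pt[] {
  const screen = toScreen(raw, widthPx, heightPx);
  const keep: boolean[] = new Array(raw.length).fill(false);
  const anchors = [...new Set([0, ...anchorIdx, raw.length - 1])].sort((a, b) => a - b);
  for (const i of anchors) keep[i] = true;
  for (let k = 0; k + 1 < anchors.length; k++) {
    rdp(screen, anchors[k], anchors[k + 1], epsPx, keep);
  }
  return raw.filter((_, i) => keep[i]);
}

Because RDP runs only between consecutive anchors, the mandatory points can never be simplified away.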

Operational expectation:

  1. long near-flat drift intervals are compressed to boundary points (typically 2-4 points per interval)
  2. sharp altitude events and state transitions remain visible due to anchor pinning

Demo

(Three demo screenshots, captured 2026-02-25.)

3) Write Path Remains Strict SI

All writes/commands remain canonical SI units only.
No unit interpretation based on UI preference is allowed for mutation endpoints.

4) Response Unit Selection Is Presentation-Only

response_unit_system may be provided only for presentation endpoints.

Rules:

  1. Core CRUD/entity APIs always return SI.
  2. Presentation endpoints may return converted values.
  3. If unspecified, the default is metric.
  4. Response must echo the effective unit system.

5) Plan Revisions Are First-Class

Presentation response must contain ordered segments, each bound to a stable plan_version_id.

Rules:

  1. No cross-segment waypoint-index matching.
  2. Segment boundaries are explicit by time and distance.
  3. Actual telemetry points belong to the active plan version at sample time.
  4. Replan transition markers are part of the payload.

6) Every Unit-Bearing Series Is Self-Describing

Each returned series must include:

  1. unit_system
  2. unit_label
  3. precision_hint

Frontend must render labels and formatting guidance from payload metadata, not local assumptions.
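
A minimal sketch of what metadata-driven formatting looks like on the client (helper name hypothetical):

interface SeriesMeta {
  unit_system: string;   // e.g. "UNIT_SYSTEM_US_CUSTOMARY"
  unit_label: string;    // e.g. "ft", "mi"
  precision_hint: number;
}

// Label and precision come from the payload, never from local preference.
function formatSeriesValue(value: number, meta: SeriesMeta): string {
  return `${value.toFixed(meta.precision_hint)} ${meta.unit_label}`;
}

// formatSeriesValue(132.4, { unit_system: "UNIT_SYSTEM_US_CUSTOMARY",
//                            unit_label: "ft", precision_hint: 0 })  →  "132 ft"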

7) Snapshot Freshness Is Mandatory

To prevent stale updates overwriting fresh data, response must include monotonic freshness keys.

Required:

  1. snapshot_id
  2. generated_at

Frontend applies only newer snapshots.
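
A sketch of that guard in the FE store update path, assuming generated_at (RFC3339) is the monotonic key:

// Drop any payload that is not strictly newer than what is already applied.
let lastAppliedAt: number | null = null;

function shouldApplySnapshot(meta: { snapshot_id: string; generated_at: string }): boolean {
  const ts = Date.parse(meta.generated_at);
  if (lastAppliedAt !== null && ts <= lastAppliedAt) return false;  // stale snapshot
  lastAppliedAt = ts;
  return true;
}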

8) Transport Decision: gRPC (Connect) Primary

For this product stack, the telemetry presentation endpoint will use gRPC over Connect as the primary transport.

Rationale:

  1. Current frontend already uses Connect/gRPC clients broadly for mission and telemetry domains.
  2. Strong schema and enum contracts fit unit-system metadata and segment semantics.
  3. Frequent update patterns and future push streaming are better aligned with gRPC than ad-hoc REST polling.
  4. Shared protobuf contracts improve multi-client parity (web/iOS/Android).

REST position:

  1. Existing REST endpoints remain supported where they already exist.
  2. No new REST telemetry presentation endpoint is required for v1.
  3. If an external non-gRPC consumer appears, a REST facade can be added later without changing canonical backend processing.

Canonical Unit Policy

Canonical backend write units remain:

  1. Distance short/local: meters
  2. Distance long/route: kilometers
  3. Altitude: meters
  4. Speed: meters/second
  5. Acceleration: meters/second^2
  6. Angle/heading/orientation: degrees
  7. Temperature: Celsius
  8. Mass/payload: kilograms
  9. Battery percent: percent
  10. Battery voltage/current/capacity: V/A/Ah
  11. Duration: seconds

Presentation conversion target may be U.S. Customary where requested.
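
For concreteness, these are the conversions implied for a US-Customary presentation response (exact standard factors; shown in TypeScript for illustration, though the production conversion is server-owned):

// Applied only on presentation endpoints when
// response_unit_system = UNIT_SYSTEM_US_CUSTOMARY. Writes stay SI.
const toUsCustomary = {
  altitudeMetersToFeet: (m: number) => m / 0.3048,          // 1 ft = 0.3048 m
  routeKmToMiles:       (km: number) => km / 1.609344,      // 1 mi = 1.609344 km
  speedMpsToMph:        (mps: number) => mps * 3600 / 1609.344,
  tempCelsiusToF:       (c: number) => (c * 9) / 5 + 32,
  massKgToLb:           (kg: number) => kg / 0.45359237,    // 1 lb = 0.45359237 kg
};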

Contract (Normative Semantics)

Field names may change at the protobuf/OpenAPI layer; the semantics below are mandatory.

Request

{
  "mission_id": "string",
  "from_time": "RFC3339 | optional",
  "to_time": "RFC3339 | optional",
  "max_points": 600,
  "client_profile": "CLIENT_PROFILE_WEB | CLIENT_PROFILE_MOBILE | optional",
  "viewport_width_px": 900,
  "device_pixel_ratio": 2,
  "series_mask": ["terrain", "planned", "actual"],
  "response_unit_system": "UNIT_SYSTEM_METRIC | UNIT_SYSTEM_US_CUSTOMARY | UNIT_SYSTEM_UNSPECIFIED",
  "include_segments": true,
  "since_snapshot_id": "string | optional"
}

Response

{
  "meta": {
    "snapshot_id": "string",
    "generated_at": "RFC3339",
    "effective_unit_system": "UNIT_SYSTEM_US_CUSTOMARY",
    "default_precision_hint": 2
  },
  "segments": [
    {
      "segment_id": "string",
      "plan_version_id": "string",
      "start_time": "RFC3339",
      "end_time": "RFC3339 | null",
      "plan_change_at": {
        "time": "RFC3339 | null",
        "distance": 1.82
      },
      "series_meta": {
        "distance": { "unit_system": "UNIT_SYSTEM_US_CUSTOMARY", "unit_label": "mi", "precision_hint": 2 },
        "altitude": { "unit_system": "UNIT_SYSTEM_US_CUSTOMARY", "unit_label": "ft", "precision_hint": 0 }
      },
      "points": [
        {
          "time": "RFC3339",
          "distance": 0.15,
          "terrain": 45.2,
          "planned": 140.0,
          "actual": 132.4,
          "contingency": null,
          "warning": null,
          "agl": 87.2,
          "msl": 177.6
        }
      ]
    }
  ]
}
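
For the FE adapter, a hedged TypeScript mirror of these semantics (field names may shift at the protobuf layer, as noted above):

type UnitSystem =
  | "UNIT_SYSTEM_METRIC"
  | "UNIT_SYSTEM_US_CUSTOMARY"
  | "UNIT_SYSTEM_UNSPECIFIED";

interface GraphRequest {
  mission_id: string;
  from_time?: string;                 // RFC3339
  to_time?: string;                   // RFC3339
  max_points?: number;
  client_profile?: "CLIENT_PROFILE_WEB" | "CLIENT_PROFILE_MOBILE";
  viewport_width_px?: number;
  device_pixel_ratio?: number;
  series_mask?: string[];
  response_unit_system?: UnitSystem;
  include_segments?: boolean;
  since_snapshot_id?: string;
}

interface SeriesMeta { unit_system: UnitSystem; unit_label: string; precision_hint: number }

interface GraphPoint {
  time: string;                       // RFC3339
  distance: number;
  terrain: number | null;
  planned: number | null;
  actual: number | null;
  contingency: number | null;
  warning: number | null;
  agl: number | null;
  msl: number | null;
}

interface Segment {
  segment_id: string;
  plan_version_id: string;
  start_time: string;
  end_time: string | null;
  plan_change_at: { time: string | null; distance: number } | null;
  series_meta: Record<string, SeriesMeta>;
  points: GraphPoint[];
}

interface GraphResponse {
  meta: {
    snapshot_id: string;
    generated_at: string;
    effective_unit_system: UnitSystem;
    default_precision_hint: number;
  };
  segments: Segment[];
}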

Render-Only Delivery Model (Normative)

Goal: the client (web/mobile) receives only the data actually needed to render the current graph state.

“No Over-Delivery” Principle

The server must not send a raw telemetry stream when the client has requested a render-ready representation.

Mandatory rules:

  1. Points in the response are already downsampled for the current viewport.
  2. Points are bounded by max_points (or the server-side cap if max_points exceeds the allowed maximum).
  3. Only the requested series (series_mask) are included in the response.
  4. If no new snapshot has appeared, the server returns an empty delta with updated meta status rather than repeating the full array.

Parameters Affecting Response Volume

To adapt for web/mobile, the request must convey render context:

  1. client_profile:
  • CLIENT_PROFILE_WEB
  • CLIENT_PROFILE_MOBILE
  2. viewport_width_px
  3. device_pixel_ratio
  4. max_points
  5. series_mask:
  • terrain
  • planned
  • actual
  • contingency
  • warning
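
For example, a web client with a 900 px viewport at 2× DPR might send (mission id hypothetical; GraphRequest as sketched in the contract section):

const request: GraphRequest = {
  mission_id: "mission-123",
  client_profile: "CLIENT_PROFILE_WEB",
  viewport_width_px: 900,
  device_pixel_ratio: 2,
  // v1 formula: min(1200, max(300, 2 * viewport_width_px)) → 1200 here
  max_points: Math.min(1200, Math.max(300, 2 * 900)),
  series_mask: ["terrain", "planned", "actual"],
  response_unit_system: "UNIT_SYSTEM_METRIC",
  include_segments: true,
};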

Server Behavior by Client Profile (v1 defaults)

  1. WEB profile:
  • target max_points: 600-1200 (depending on viewport_width_px)
  • default cadence: 1s (or 500ms during warning/contingency)
  2. MOBILE profile:
  • target max_points: 250-500
  • default cadence: 1s (or 500ms during warning/contingency), with a stricter payload budget

Payload Budget Targets (SLO)

  1. WEB:
  • p95 response size <= 180KB for full refresh
  • p95 delta size <= 40KB
  2. MOBILE:
  • p95 response size <= 80KB for full refresh
  • p95 delta size <= 20KB

Incremental Update Semantics

since_snapshot_id must return only changes relative to the specified snapshot:

  1. New points in active segments.
  2. New or updated segment markers (e.g., replan boundary).
  3. Updated meta (snapshot_id, generated_at).

If no changes have occurred:

  1. Response contains meta confirming current freshness.
  2. segments[].points may be empty.

Segment Delivery Model: Full Refresh vs. Delta


The graph must always display the complete mission — all plan versions from takeoff to now. The delivery model has two modes:

Full refresh (first call, or since_snapshot_id omitted):

  • Server returns ALL segments with ALL their (downsampled) points.
  • FE builds the complete graph state from scratch.
  • Payload budget: ≤ 180 KB web / ≤ 80 KB mobile (p95).
  • Used on: graph open, page reload, error recovery, fallback.

Delta update (subsequent polls with since_snapshot_id):

  • Server returns only segments that changed since the referenced snapshot.
  • Unchanged segments are not re-sent — the FE keeps them in local state.
  • Payload budget: ≤ 40 KB web / ≤ 20 KB mobile (p95).
  • Used on: every 1s poll during flight.

What a delta may contain:

  1. New points appended to the currently active segment.
  2. A closed segment (its end_time set) if a replan occurred.
  3. A brand-new segment (new plan_version_id) after a replan.
  4. Updated meta with a new snapshot_id.

FE merge rule: the frontend always holds ALL segments in memory (the full graph). Deltas are merged into this state — new points are appended, new segments are added, closed segments are updated. The result is that the FE always has the complete picture; only the wire payload is minimized.
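
A sketch of that merge rule, using the shapes from the contract section (function name hypothetical):

// State always holds ALL segments; a delta only appends, closes, or adds.
function mergeDelta(state: Segment[], delta: GraphResponse): Segment[] {
  const byId = new Map<string, Segment>();
  for (const s of state) byId.set(s.segment_id, s);
  for (const seg of delta.segments) {
    const existing = byId.get(seg.segment_id);
    if (!existing) {
      byId.set(seg.segment_id, seg);                      // new segment after a replan
    } else {
      byId.set(seg.segment_id, {
        ...existing,
        end_time: seg.end_time ?? existing.end_time,      // replan closed this segment
        points: [...existing.points, ...seg.points],      // append new points
      });
    }
  }
  return [...byId.values()];   // insertion order: existing segments, then new ones
}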

Example polling flow:

1. FE opens graph → full refresh (no since_snapshot_id)
   Server → FE: { segments: [seg1(280pts), seg2(150pts)], snapshot: "snap-047" }
   FE state: [seg1 ■■■■■■■■ | seg2 ■■■■■]  ← full graph

2. 1s later → delta poll
   Server → FE: { segments: [seg2(3 new pts)], snapshot: "snap-048" }
   FE state: [seg1 ■■■■■■■■ | seg2 ■■■■■■]  ← 3 pts appended to seg2

3. 1s later → delta poll, nothing changed
   Server → FE: { segments: [], snapshot: "snap-048" }
   FE state: [seg1 ■■■■■■■■ | seg2 ■■■■■■]  ← unchanged

4. Replan happens → next delta poll
   Server → FE: { segments: [seg2(end_time set), seg3(new, 2pts)], snapshot: "snap-049" }
   FE state: [seg1 ■■■■■■■■ | seg2 ■■■■■■ | seg3 ■]  ← seg2 closed, seg3 started

Why This Matters

  1. A single backend contract serves web and mobile without duplicating logic.
  2. Network and client CPU scale with the actual screen size and UX scenario.
  3. Eliminates transmission of thousands of surplus points that have no impact on the rendered result.

Graph Update Cadence (Normative)

The frontend must not call the presentation endpoint at an unconstrained rate.

Default Polling Cadence (v1)

  1. Mission IN_PROGRESS or HOLD: call every 1s with since_snapshot_id.
  2. Mission CONTINGENCY or WARNING active: call every 500ms with since_snapshot_id.
  3. Mission ASSIGNED/READY (not flying): call every 10s.
  4. Mission COMPLETED/TERMINATED: no periodic polling; fetch once on open and on manual refresh.

Rationale

  1. 1s during standard flight keeps graph responsive while controlling backend QPS.
  2. 500ms during warning/contingency preserves operator visibility in critical transitions.
  3. Non-flying states do not require high-frequency refresh.
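
The cadence table reduces to a simple selector; a sketch (state names are assumed to match the mission-state enum):

type MissionState =
  | "IN_PROGRESS" | "HOLD"
  | "CONTINGENCY" | "WARNING"
  | "ASSIGNED" | "READY"
  | "COMPLETED" | "TERMINATED";

// Returns the polling interval in ms, or null for "no periodic polling".
function pollIntervalMs(state: MissionState): number | null {
  switch (state) {
    case "CONTINGENCY":
    case "WARNING":     return 500;    // critical transitions
    case "IN_PROGRESS":
    case "HOLD":        return 1000;   // standard flight
    case "ASSIGNED":
    case "READY":       return 10_000; // not flying
    case "COMPLETED":
    case "TERMINATED":  return null;   // fetch once on open / manual refresh
  }
}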

Frontend Scope Changes

Replace

  1. FE graph merge logic in useFlyGraphData for altitude route assembly.
  2. Worker merge dependency (useFullGraphData and getFullRouteData) for this endpoint.
  3. Graph unit-label dependence on the global local-storage preference; for server-provided series, labels come from payload metadata instead.

Keep

  1. Rendering components (Graph, tooltip, markers).
  2. Mission-state UI behavior (flying/contingency/warning visuals).
  3. SI write-path conversion helpers until mutation APIs are fully typed SI at form boundaries.

Add

  1. Telemetry presentation service client (services/telemetry/...).
  2. Adapter mapping server payload to existing graph prop model.
  3. Snapshot freshness guard in store update path.
  4. Feature-flag switch and rollback fallback.
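
The adapter in item 2 can stay very thin; a sketch (row prop names are assumptions about the existing Graph component):

interface GraphRow {
  time: string;
  distance: number;
  terrain: number | null;
  planned: number | null;
  actual: number | null;
  planVersionId: string;   // lets the renderer break lines at replan boundaries
}

function toGraphRows(segments: Segment[]): GraphRow[] {
  return segments.flatMap(seg =>
    seg.points.map(p => ({
      time: p.time,
      distance: p.distance,
      terrain: p.terrain,
      planned: p.planned,
      actual: p.actual,
      planVersionId: seg.plan_version_id,
    }))
  );
}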

Acceptance Criteria

Functional Correctness

  1. No UI regressions on high-density telemetry missions.
  2. Replan missions show correct segmented planned/actual traces.
  3. Cross-client numerical parity holds within agreed tolerances.
  4. Write APIs remain strict SI and pass existing mutation tests.
  5. Every unit-bearing series includes effective unit metadata.
  6. Stale snapshot responses never override fresher data in FE state.

Downsampling Fidelity (Gate Criteria)

  1. Point count is reduced by at least 70% on representative missions (baseline: ~3,200 → target: <960) without visually significant shape loss. This is the primary gate criterion.
  2. As a secondary validation: a 20-minute near-flat drift interval produces ≤4 points while preserving segment boundary correctness.

Performance Budgets

  1. Response payload stays within p95 budget targets: 180 KB web full / 40 KB web delta / 80 KB mobile full / 20 KB mobile delta.

Risk Register

  1. Backend CPU increase. Mitigation: bounded max_points, caching, pre-aggregation.
  2. Schema complexity. Mitigation: versioned contract + consumer-driven tests.
  3. Rollout regressions. Mitigation: feature flag + dual-path parity + replay.
  4. Partial replan semantics. Mitigation: block rollout until plan_version_id segmentation is fully implemented.

Consequences

Positive

  1. Single source of truth for graph data — Conversion formulas, downsampling, and plan segmentation live in one place (server). Bug fixes propagate to all clients immediately.
  2. Reduced FE complexity — The altitude graph frontend becomes a thin rendering client. useFlyGraphData pipeline (~800 LOC including worker code) can be retired.
  3. Better operator safety during replans — Plan-version segmentation eliminates a class of index-remap bugs that could show incorrect altitude traces during live flight.
  4. Smaller payloads — 70%+ reduction in points per response directly reduces bandwidth and parse time, especially on web.

Negative

  1. Backend team now owns graph-shape correctness — Bugs in downsampling or segment boundary logic require backend deployment to fix. FE cannot patch around them locally.
  2. New endpoint coupling — The FE altitude graph is now hard-dependent on the presentation endpoint. If the endpoint is down or slow, the graph is entirely unavailable (mitigated by dual-path fallback in Phase 1–2).
  3. Increased backend compute — Server-side RDP and unit conversion add CPU per request. Must be monitored and bounded (see Risk Register).
  4. Schema evolution overhead — Any change to graph series or segment semantics requires coordinated protobuf schema updates across FE and BE.

Alternatives Considered

  1. Patch existing FE pipeline only. Rejected: addresses performance partially but does not solve cross-platform formula duplication or replan segmentation. Was implemented as a stopgap (memoization and throttling in useFlyGraphData); this ADR replaces that approach. Estimated effort: ~2 weeks FE-only, but yields single-platform benefit and leaves consistency gap open.
  2. FE WASM math acceleration. Rejected: reduces compute time on the client but does not reduce payload volume (~450 KB still transferred). Does not address multi-client contract drift (iOS/Android would still need their own implementations). Estimated effort: ~6 weeks for WASM module + integration, with ongoing maintenance burden across platforms.
  3. Unit-aware writes (accept user-preferred units in mutation APIs). Rejected: introduces ambiguity in write paths — a mutation with altitude: 400 could mean meters or feet depending on a header/parameter. This is a flight-safety risk. All mutation APIs must remain unambiguous canonical SI.

Rollback Plan

  1. Disable server-series feature flag.
  2. Revert to legacy FE graph pipeline.
  3. Capture and replay failing snapshot payloads for root-cause.

Open Questions

Blocking for v1 (must resolve before Phase 1 implementation)

  1. Segment point model: pre-aligned terrain per point vs. separate terrain track with index mapping. Affects protobuf schema and downsampling pipeline design.
  2. Contingency/warning representation: separate series vs. state-tagged actual series. Affects series_mask semantics and FE adapter shape.

Deferred (can resolve during Phase 2 or later)

  1. Whether true push transport (SSE/WebSocket) is required beyond v1, or polling + since_snapshot_id is sufficient long-term.
  2. Final parity tolerances by metric (distance, AGL, MSL, boundaries) — to be derived from Phase 1 parity comparison data.
  3. Production tuning for epsilon_x_px / epsilon_y_px by viewport and device pixel ratio — to be refined during Phase 2 visual regression testing.

Links

[Uncrew Apollo Altitude Graph: Architecture and Dynamic Plan Update Strategy Overview](confluence-title://UE/Uncrew Apollo Altitude Graph: Architecture and Dynamic Plan Update Strategy Overview)
