May 2026: Unification, Simplification, Stabilization
| Field | Value |
|---|---|
| Status | Draft |
| Owner | Andi Lamprecht |
| Contributors | Product & Engineering (org-wide); CEO sign-off required |
| Timebox | May 2026 (target finish 2–3 weeks earlier) |
| Audience | DroneUp Product & Engineering team; leadership |
Executive summary
DroneUp’s platform was designed 3–5 years ago around a single hyperscale drone-delivery future. The business has since consolidated around two product lines, Uncrew (the cloud ground-control system that operates drones) and ATOMx (airspace management), with shared services underneath. The platform has not.
This initiative reshapes the platform around that target, and uses the same work to dramatically stabilize the cloud software we already have. By the end of May, we expect the direction to be set, the first concrete pieces to be in dev, and the measurable improvements in stability, unification, and cost to be visible to the whole organization.
This PRD defines the target state, the outcomes we commit to, and the principles that every concrete design must satisfy. It intentionally does not prescribe specific implementations. The “how” is worked out by the teams, guided by the principles and informed by the reference material in the appendix.
Why now
- The business has settled on Uncrew + ATOMx. The platform must stop looking like a hyperscale delivery system and start looking like the business.
- Upcoming BD work requires supporting airframes beyond Auterion (e.g., Blueflite on PX4 / ArduPilot). The current vendor-locked, monolithic onboard architecture cannot do this.
- Approximately 20 ADRs have been decided but never implemented. Decision debt compounds on top of code debt.
- Recurring GCP spend is higher than the business requires. A scorecard of ~$8.9K/month in savings has already been identified.
- AI-powered development tooling is mature enough in 2026 to reshape how we build software, not just how we write code. The platform we invest in now should be built to be understood and evolved by AI agents as a first-class collaborator.
The target state
Two business lines on a shared platform
```mermaid
graph TB
    subgraph UncrewRoles["Uncrew Roles"]
        UR_STD["Standard: RPIC / Supervisor / Compliance / Hub Ops"]
        UR_ADM["Admin: Fleet Admin / Tenant Admin"]
    end
    subgraph ATOMxRoles["ATOMx Roles"]
        AR_STD["Standard: Airspace Manager"]
        AR_ADM["Admin: Airspace Admin"]
    end
    subgraph Uncrew["Uncrew (Cloud GCS for drone operations)"]
        UC["Uncrew applications"]
    end
    subgraph ATOMx["ATOMx (Airspace Management)"]
        AX["ATOMx applications"]
    end
    subgraph Shared["Shared platform services"]
        UTM["UTM / USS"]
        GEO["Geodata"]
        IAM["Identity, RBAC, ABAC"]
        INV["Device Inventory & C2"]
        OBS["Observability"]
        NOTIF["Notifications"]
        DATA["Shared data plane"]
    end
    UR_STD --> UC
    UR_ADM --> UC
    AR_STD --> AX
    AR_ADM --> AX
    UC --> Shared
    AX --> Shared
```
What “target state” means, per dimension
| Dimension | Today | Target (end state, not timeline) |
|---|---|---|
| Product lines | Implicit, blurred across services | Two explicit lines (Uncrew and ATOMx) with a named, documented shared platform between them |
| Frontends | Multiple stacks (Uncrew, HubOps, UTM, Zebra) with overlapping concerns | Minimum viable set per business line, one design system, one auth, one core |
| Languages | Polyglot | Go for backend; Clojure retired; Python where necessary; React/TS/Vite for UI; TBD whether Swift/Kotlin make sense over React Native for mobile |
| Geodata | Multiple overlapping services and pipelines | One canonical cache design used by cloud and onboard |
| Onboard software | C++ monolith, ~2 GB, Auterion-locked | Lightweight, hardware-agnostic, certifiable-by-design agent, ideally in Go |
| Device management | Split UA/DBX inventory services; Auterion Suite for onboarding; VPN debug | One unified inventory, identity, and command-and-control surface |
| Observability | Siloed; onboard telemetry as a sidecar | One OTel pipeline spanning cloud and onboard, optimized for auditability and cellular throughput |
| Infrastructure | Many GKE projects, always-on workloads, 31 Cloud SQL dev instances | Consolidated clusters, demand-driven workloads where safe, shared dev data tier |
| Access control | Ad-hoc per service | Unified RBAC for deploy/prod access; RBAC + ABAC for application roles (pilot, supervisor, compliance, airspace-manager) |
| Architecture decisions | Scattered across Confluence, GitHub repos, service folders | Single canonical home, AI-native lifecycle, every accepted decision traceable to an implementation |
| Cloud code quality | Variable coverage, ad-hoc TODOs, implicit assumptions | Explicit requirements, ≥98% coverage, mutation-tested, zero open TODOs on in-scope services |
Primary outcomes
These are the measurable changes we commit to. Every workstream and design lands against one or more of these.
1. Unification
The platform is visibly organized around two business lines with a shared foundation. A new engineer (or AI agent) can answer “what does Uncrew own? what does ATOMx own? what is shared?” by reading one diagram and one inventory.
2. Simplification
We end the month with measurably fewer moving parts than we started: fewer services, fewer frontends, fewer GCP projects, fewer data stores, fewer Pub/Sub topics. Adding a new airframe, tenant, or application role does not require a code fork. Where a mature standard component exists, we use it rather than maintaining a custom one.
3. Stabilization
Our cloud software becomes boring in the good sense: predictable, well-specified, fully tested, and trustworthy.
- Explicit requirements for every in-scope service. Where requirements don’t exist, we use AI to derive them from the existing code and commit them as first-class artifacts.
- Explicit verifications mapped 1:1 to those requirements, so every requirement has at least one test proving it.
- ≥98% test coverage across the in-scope services.
- Mutation testing baseline established so that coverage reflects meaningful tests, not lines touched.
- Zero open TODOs in the in-scope services. Every TODO is either implemented or converted into a tracked issue with an owner.
- Unproven assumptions audited and resolved. Comments like “assumes X”, silent fallbacks, and defensive-code-for-impossible-states are either validated by tests or removed.
- Bug-hotspot analysis. The services with the highest historical defect rate get the most stabilization attention.
- Regular, automated, AI-assisted QA. We put the product through the wringer, nearly all day, every day. Simple findings are tackled right away by AI, never autonomously and always human-gated.
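The 1:1 requirement-to-verification mapping in the list above can be sketched as a tiny register check. Requirement IDs, test names, and the behaviors they describe are hypothetical; in practice the register would live in Markdown files or issues and be parsed, not hard-coded:

```go
package main

import "fmt"

// Hypothetical requirements register for one in-scope service.
var requirements = map[string]string{
	"REQ-UNC-001": "Mission plan creation rejects plans without UTM authorization",
	"REQ-UNC-002": "Telemetry gaps longer than 5s raise an alert",
}

// verifications maps each test (verification) to the requirement it proves.
var verifications = map[string]string{
	"TestMissionPlanRequiresAuthorization": "REQ-UNC-001",
	"TestTelemetryGapAlert":                "REQ-UNC-002",
}

// uncovered returns requirement IDs with no verification mapped to them:
// the list that must be empty before a service counts as stabilized.
func uncovered() []string {
	proven := map[string]bool{}
	for _, req := range verifications {
		proven[req] = true
	}
	var missing []string
	for id := range requirements {
		if !proven[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	fmt.Println("uncovered requirements:", uncovered())
}
```

A check like this is what makes "every requirement has at least one test proving it" enforceable in CI rather than aspirational.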
4. Cost optimization
The existing GCP Cost Optimization milestone is the scorecard. Current picture: ~$1,960/month banked, ~$6,898/month identified in 21 open items, combined target $8,858/month ($106K/year). We commit to ≥80% of open items closed or explicitly marked “won’t do” with rationale by 2026-05-31.
5. Demonstrability
The unified platform produces one scripted, rehearsable golden-path demo that exercises the full Uncrew operational flow end-to-end across the shared platform services. The same artifact serves two audiences:
- Business Development. A presentable, self-contained demo that does not require engineers to hand-hold. Usable in government and commercial partner conversations without a rebuild or product/engineering involvement each time.
- Verification. The same script runs as an integration smoke harness on every trunk merge. It catches the class of cross-service regressions that unit tests and mutation testing cannot — the failures that emerge only when unification is partially done.
Concretely, the golden path covers at minimum for a fleet of 5 drones: RPIC authentication → mission plan creation → UTM authorization → takeoff → live telemetry → mid-mission contingency (CLZ redirect or RTL) → landing → audit artifact retrieval. Every shared platform service in the target-state diagram is touched at least once.
Any breaking change to a shared service that fails the golden-path run blocks the merge. This is the integration contract that replaces hope.
6. Process modernization (AI-native by default)
Architectural decisions, requirements, and documentation are produced, reviewed, and maintained in an AI-native workflow. A single canonical home for ADRs replaces the current sprawl across Confluence and multiple repos. An “Accepted” ADR without a linked implementation is no longer a valid terminal state. Requirements management becomes fully integrated into the AI-powered development workflow.
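The "Accepted without a linked implementation is not terminal" rule can be sketched as a triage check over ADR metadata. The field names here are illustrative, not a mandated schema:

```go
package main

import "fmt"

// ADR metadata as it might appear in front matter of a canonical ADR file.
type ADR struct {
	ID             string
	Status         string // e.g. Proposed, Accepted, Implemented, Superseded
	Implementation string // link to the PR or issue proving the decision landed
}

// needsTriage flags Accepted ADRs with no linked implementation:
// exactly the "Accepted-but-not-Implemented" list the dashboard surfaces.
func needsTriage(adrs []ADR) []string {
	var flagged []string
	for _, a := range adrs {
		if a.Status == "Accepted" && a.Implementation == "" {
			flagged = append(flagged, a.ID)
		}
	}
	return flagged
}

func main() {
	adrs := []ADR{
		{ID: "ADR-0007", Status: "Accepted"},
		{ID: "ADR-0012", Status: "Accepted", Implementation: "uncrew/pull/322"},
	}
	fmt.Println("needs triage:", needsTriage(adrs))
}
```

Run on every merge to the canonical ADR home, this keeps decision debt visible instead of letting it compound silently.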
Design principles
Every concrete design shipped under this initiative is measured against these. A design that fails one does not ship under this umbrella.
- Optimize for 2026+ AI-powered development workflows. Code, configuration, documentation, and tests should be legible and actionable by AI agents with minimal priming: one canonical source per concern, machine-readable structure, small focused modules, stable public contracts.
- If we can simplify, we must. Removing a service, a pipeline, a repo, or a database is always preferable to maintaining it.
- If we can unify, we must. Prefer one shared implementation over N near-duplicates.
- Standard industry components beat custom ones. Custom implementations must earn their place by solving a problem the standard option genuinely cannot.
- Hardware- and vendor-agnostic where the business allows. Abstraction layers are mandatory for anything that may span platforms or vendors.
- Certifiable by design. New safety-relevant software is structured from day one to produce DO-178C / DO-278A lifecycle artifacts fulfilling DAL-D objectives directly, rather than retrofitted later.
- Requirements before code; verifications before “done”. Every in-scope service has explicit requirements and verifications, even if AI-backfilled. “I tested it by running it” is no longer an acceptable statement of verification.
Constraints
- Part 135 certificate: production infrastructure is under configuration control. All May work targets dev and sandbox environments. Any production change routes through the DCM Change Request process afterwards.
- Regulatory continuity. Any work touching UTM maintains ASTM F3548-21 compliance and passes the InterUSS qualifier suite.
- No cloud or vendor migration. We remain on GCP.
Success criteria
Unification & Simplification
- Target-state architecture diagram accepted.
- A documented inventory of shared-vs-business-line ownership is published in this repo.
- Measurable reductions published at month end: counts of services retired, frontends merged, GCP projects deprecated, Pub/Sub topics removed, data stores consolidated.
Stabilization (cloud software)
- For every in-scope service: explicit requirements register and verification matrix, kept in Markdown files or GitHub issues for now, until we migrate to a dedicated system for managing those artifacts.
- Test coverage ≥98% across in-scope services.
- Mutation testing baseline established; mutation score reported per service.
- Zero open TODOs in in-scope services (implemented or converted to tracked issues).
- Bug-hotspot analysis published; top-N hotspots have stabilization work scheduled.
Demonstrability
- One scripted golden-path demo runs green in sandbox on every trunk merge.
- The demo covers every shared platform service in the target-state diagram at least once.
- The demo is presentable to a non-technical audience without live engineer or product narration, with a documented script and reset procedure.
- A breaking change to a shared service that fails the demo blocks merge.
Cost
- ≥80% of milestone #1 open items closed, or explicitly marked “won’t do” with rationale. The GCP monthly bill drops below $23K.
Process
- Canonical ADR location live in this repo; accepted-and-active ADRs migrated; legacy sources reduced to pointers.
- AI-native ADR process document published and referenced from CONTRIBUTING.
- Dashboard surfaces “Accepted-but-not-Implemented” ADRs as a visible, triage-able list.
- Process for managing requirements, verification results and changesets thereof is established and documented.
- Boards (CCB, ARB, etc.) run with AI-assisted tooling (e.g., notetaker, task-master, librarian).
Proposal rating scheme
Every proposal under this umbrella is scored on the same six dimensions as our Primary outcomes. Same names, same order — no translation layer between what we committed to and how we rank work. Proposers self-score; the review panel re-scores at intake and the panel’s numbers are authoritative.
Each dimension is scored 0–3. The total is the sum (max 18).
| Score | Meaning |
|---|---|
| 0 | No effect (or regression) |
| 1 | Local — one service, one team, minor cleanup |
| 2 | Meaningful — measurable gain across several services or a major surface |
| 3 | Structural — retires a subsystem, eliminates a class of problem, or sets a platform-wide baseline |
Dimensions
| Dimension | What it asks |
|---|---|
| Unification (U) | Does this collapse N implementations, stacks, or surfaces into one shared one? |
| Simplification (S) | Does this reduce the count of moving parts (services, frontends, projects, topics, stores)? |
| Stabilization (R) | Does this move an in-scope service toward the stabilization baseline — explicit requirements, ≥98% coverage, mutation-tested, zero-TODO, assumptions resolved? |
| Cost (C) | Verifiable recurring savings. 1 = <$1K/mo, 2 = $1–3K/mo, 3 = >$3K/mo or structurally unblocks future savings. Must cite a line item in milestone #1 or an equivalent documented estimate. |
| Demonstrability (D) | Does this extend, harden, or unblock the golden-path demo — so a breaking change to a shared service is caught on merge and the demo is presentable without engineer hand-holding? |
| Modernization (M) | Does this advance the AI-native workflow — canonical ADRs, machine-readable requirements/verifications, AI-legible code and contracts, AI-assisted board tooling? |
Gate: evidence before scoring
A proposal must cite the evidence that justifies its score on any non-zero dimension — the incident, BD conversation, on-call page, Jira cluster, ADR, or cost line item. No citation, no score on that dimension. “Feels more stable” and “seems simpler” don’t count.
If U, S, R, C, D, and M are all 0, the proposal does not belong under this umbrella and should be routed to the regular roadmap.
Score what the PRD will deliver by the end of its timebox, not the aspirational end state.
Ranked buckets
| Total | Bucket | Default action |
|---|---|---|
| 14–18 | Must-do | Land under this initiative; owner assigned within the week |
| 10–13 | Should-do | Land if capacity allows; defer competing lower-bucket items first |
| 6–9 | Nice-to-have | Pick up opportunistically or bundle with a higher-ranked PRD |
| 3–5 | Defer | Park with a clear revisit date; not prioritized this cycle |
| 0–2 | Reject / re-scope | Does not justify the coordination cost |
Ties are broken by the dimension the proposal leads on, in outcome order: U > S > R > C > D > M.
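The scoring arithmetic and bucket mapping above can be sketched directly (helper names are illustrative; the authoritative numbers come from the review panel):

```go
package main

import "fmt"

// Scores holds one proposal's six dimensions, each 0-3,
// in outcome order: U > S > R > C > D > M.
type Scores struct{ U, S, R, C, D, M int }

// Total sums the six dimensions (max 18).
func (s Scores) Total() int { return s.U + s.S + s.R + s.C + s.D + s.M }

// Bucket maps a total to its default action per the ranked-buckets table.
func Bucket(total int) string {
	switch {
	case total >= 14:
		return "Must-do"
	case total >= 10:
		return "Should-do"
	case total >= 6:
		return "Nice-to-have"
	case total >= 3:
		return "Defer"
	default:
		return "Reject / re-scope"
	}
}

func main() {
	// Worked example from this document: frontend consolidation.
	fc := Scores{U: 3, S: 3, R: 2, C: 2, D: 2, M: 1}
	fmt.Println(fc.Total(), Bucket(fc.Total())) // 13 Should-do
}
```

Keeping the bucket thresholds in one place like this avoids the usual drift between the scoring doc and whatever spreadsheet the panel actually uses.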
Worked example — frontend consolidation
Reduce Uncrew + HubOps + UTM admin + Zebra shells to a single Uncrew shell and a single ATOMx shell.
- U = 3 — collapses 4 frontends into 2 across both business lines.
- S = 3 — retires 2 Next.js apps, their CI/CD, and their auth integrations.
- R = 2 — one design system and one auth integration removes a recurring class of session-handling bugs.
- C = 2 — ~$600/mo hosting + build minutes plus ~$2K/mo engineer-time equivalent.
- D = 2 — golden-path demo gains a single frontend to drive; reset script simplified.
- M = 1 — one AI-legible component library replaces four divergent ones.
Total: 13 → Should-do.
Scoring table (append new proposals here)
Scores below are intentionally blank — they will be filled in from the review-panel survey. Each proposal is linked to its draft PRD under 0002-proposals and to its originating PR.
| Proposal | Author | Draft | PR | U | S | R | C | D | M | Total | Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (example: Frontend consolidation) | — | — | — | 3 | 3 | 2 | 2 | 2 | 1 | 13 | Should-do |
| Metis — AI-Native Delivery Platform | John Vernon | draft | (new) | ||||||||
| Argus — OTel-Native Onboard Observability | TBD | draft | (was PRD-0001) | ||||||||
| Health Monitoring & On-Call Management | TBD | draft | (new) | ||||||||
| Light-Physics Drone Simulation (Uncrew) | Remek Zajac | draft | #307 | ||||||||
| Onboard Software Deployment via Inventory Service (Uncrew) | Ihor Prozhoha | draft | #320 | ||||||||
| AOI Management Service (Atlas) | Szymon Sikora | draft | #321 | ||||||||
| Themis API Go Rewrite (UTM) | Eric Gesell | draft | #322 | ||||||||
| Airspace Management — NFZ Creation & CLZ Management (Uncrew) | Ihor Prozhoha | draft | #324 | ||||||||
| Modular Screen Architecture for uncrew-apollo-frontend (ADR) | Oleksii Naboichenko | draft | #325 |