AOI Management Service (Atlas)
| Field | Value |
|---|---|
| Status | Draft |
| Owner | Szymon Sikora |
| Contributors | TBD |
| Date | 2026-04-20 |
PRD: AOI Management Service — Demand-Driven Data Acquisition
1. Executive Summary
Problem Statement: The Atlas pipeline has no geographic scoping mechanism, creating four compounding problems for the data ops team:
- Requesting new data is a manual engineering process — adding coverage for a new area requires a code change and a PR, introducing days of lead time for what should be a self-service operation.
- No visibility into what data we have — there is no inventory of which geographic areas are currently covered, making it hard to answer “do we have data for X?” without querying raw tables.
- Data readiness is opaque — knowing whether a pipeline has finished processing for a given area requires direct Airflow monitoring and knowledge of which DAGs to watch.
- OSM updates are all-or-nothing and risky — refreshing obstacle data updates every site simultaneously, which can silently introduce new obstacles into already-validated areas or remove existing ones, with no way to scope the update to only areas that need it.
Proposed Solution: A new Go microservice and map-based admin UI enabling internal operators to define Areas of Interest that constrain both scheduled and on-demand (OSM refresh) pipeline acquisition, with per-pipeline status visibility.
Success Criteria:
- All active pipeline DAGs acquire data only within defined AOI geometries
- Data refresh for a specific AOI runs in isolation — does not affect other AOI pipeline runs
- Zero data loss: AOI deletion is soft-delete only
Scope Clarity: A — requirements clear, stakeholders aligned.
2. User Experience & Functionality
User Personas:
- Data Ops Analyst — defines AOIs for new operational areas; requests OSM data refreshes
- Data Engineer — monitors per-pipeline status; investigates failures
- Atlas Pipeline (Airflow) — machine consumer of the API
User Stories:
As a Data Ops Analyst, I want to draw a polygon on a map or upload a GeoJSON file to define an AOI.
- Map renders with polygon/rectangle drawing tools
- File upload accepts `.geojson`/`.json`; invalid GeoJSON rejected with a descriptive error before submit
- Drawn/uploaded geometry rendered on map for confirmation before save
- AOI created with derived status `pending`, empty pipeline list
As a Data Ops Analyst, I want to see AOI processing status broken down by individual pipeline.
- AOI detail view lists all pipelines: name and status (`pending | processing | success | failed`), plus error message if failed
- Derived AOI status: all pipelines `success` → `ready`; any `failed` → `failed`; any `processing` → `processing`; none started → `pending` (see the sketch after this list)
- UI refreshes without full page reload (polling or SSE)
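The derivation rule above is simple aggregation and can live entirely in the service layer. A minimal Go sketch, assuming illustrative names (`PipelineStatus`, `DeriveAOIStatus`) that are not part of the spec:

```go
package aoi

// PipelineStatus mirrors the four per-pipeline states from the spec.
type PipelineStatus string

const (
	Pending    PipelineStatus = "pending"
	Processing PipelineStatus = "processing"
	Success    PipelineStatus = "success"
	Failed     PipelineStatus = "failed"
)

// DeriveAOIStatus applies the precedence from the acceptance criteria:
// any failed -> failed; any processing -> processing; all success -> ready;
// none started -> pending.
func DeriveAOIStatus(pipelines []PipelineStatus) string {
	if len(pipelines) == 0 {
		return "pending" // no callbacks received yet
	}
	anyProcessing := false
	allSuccess := true
	for _, s := range pipelines {
		switch s {
		case Failed:
			return "failed" // failed outranks everything else
		case Processing:
			anyProcessing = true
			allSuccess = false
		case Pending:
			allSuccess = false
		}
	}
	if anyProcessing {
		return "processing"
	}
	if allSuccess {
		return "ready"
	}
	// Mixed pending/success with nothing in flight; the spec leaves this
	// edge case open, so it is treated as pending here.
	return "pending"
}
```

One point worth settling during implementation: the criteria do not say whether `failed` or `processing` wins when both are present; the sketch lets `failed` win so operators see errors as early as possible.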
As a Data Ops Analyst, I want to request a data refresh for a specific AOI so updated OSM data is loaded only for that area.
- “Refresh Data” action available on AOIs whose latest run status is `ready` or `failed`; returns `409` if latest run is still `processing`
- AOI service generates a new `run_id` UUID, creates an `aoi_runs` record, then calls Airflow with `conf: {aoi_id, run_id}`
- New run starts with all pipelines `pending`; previous run’s records are retained for history
- Refresh run does not affect other AOIs’ pipeline runs
As a Data Ops Analyst, I want pipeline processing to start automatically when I create an AOI so I don’t have to wait for the next scheduled run.
- On `POST /areas-of-interest`, the AOI service calls the Airflow REST API immediately after persisting the record
- AOI status transitions to `processing` once Airflow confirms the DAG run was accepted
- If the Airflow trigger fails, the AOI remains `pending` and an error is surfaced in the UI; user can retry via “Refresh Data”
- Trigger uses the same Airflow REST API call as the refresh flow: `POST /api/v1/dags/{osm_dag_id}/dagRuns` with `conf: {aoi_id, run_id}`
As a Data Ops Analyst, I want to update an AOI’s name or geometry.
- Name update does not affect pipeline status
- Geometry update resets all pipeline records to `pending` (fresh run on next schedule)
As a Data Ops Analyst, I want to soft-delete an AOI.
- Sets `deleted_at`; record not removed from database (write sketched below)
- Deleted AOIs excluded from pipeline queries
- Visible in UI with “deleted” indicator
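A minimal sketch of the soft-delete write against the `areas_of_interest` table defined in §4; the function name and `database/sql` wiring are illustrative:

```go
package aoi

import (
	"context"
	"database/sql"
)

// SoftDeleteAOI sets deleted_at instead of removing the row, per the
// zero-data-loss success criterion. The deleted_at IS NULL guard makes a
// repeat delete report "not found" rather than bumping timestamps.
func SoftDeleteAOI(ctx context.Context, db *sql.DB, aoiID string) error {
	res, err := db.ExecContext(ctx,
		`UPDATE areas_of_interest
		    SET deleted_at = now(), updated_at = now()
		  WHERE id = $1 AND deleted_at IS NULL`, aoiID)
	if err != nil {
		return err
	}
	n, err := res.RowsAffected()
	if err != nil {
		return err
	}
	if n == 0 {
		return sql.ErrNoRows // unknown id or already soft-deleted
	}
	return nil
}
```

A handler would map `sql.ErrNoRows` to `404` and success to the `204` listed in the API table.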
As an Airflow DAG, I want to query all active AOIs on schedule.
- `GET /areas-of-interest` returns non-deleted AOIs with GeoJSON geometry (see the sketch after this list)
- Response ≤ 200 ms for up to 500 AOIs
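Serializing the geometry inside PostgreSQL keeps the Go handler thin. A sketch of the query behind `GET /areas-of-interest`, assuming plain `database/sql` and the PostGIS `ST_AsGeoJSON` function; struct and function names are illustrative:

```go
package aoi

import (
	"context"
	"database/sql"
	"encoding/json"
)

// ActiveAOI is the wire shape returned to the scheduled DAG. Geometry is
// emitted as raw GeoJSON by PostGIS, so the service never parses WKB.
type ActiveAOI struct {
	ID       string          `json:"id"`
	Name     string          `json:"name"`
	Geometry json.RawMessage `json:"geometry"`
}

// ListActiveAOIs backs GET /areas-of-interest. The partial index on
// deleted_at (see the data model in §4) keeps this a cheap scan, which is
// what the 500-AOI / 200 ms target relies on.
func ListActiveAOIs(ctx context.Context, db *sql.DB) ([]ActiveAOI, error) {
	rows, err := db.QueryContext(ctx,
		`SELECT id, name, ST_AsGeoJSON(geometry)
		   FROM areas_of_interest
		  WHERE deleted_at IS NULL`)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var out []ActiveAOI
	for rows.Next() {
		var a ActiveAOI
		var g []byte
		if err := rows.Scan(&a.ID, &a.Name, &g); err != nil {
			return nil, err
		}
		a.Geometry = json.RawMessage(g)
		out = append(out, a)
	}
	return out, rows.Err()
}
```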
As an Airflow DAG, I want to report per-pipeline status back to the AOI service so operators can track readiness.
- `PATCH /areas-of-interest/{id}/runs/{run_id}/pipelines/{pipeline_name}` accepts `{status: "processing"|"success"|"failed", error?: string}`
- `run_id` is the UUID generated by the AOI service at trigger time and passed to Airflow via `conf`
- Endpoint is idempotent — re-sending the same status is a no-op
- AOI service creates the pipeline record on first callback if it does not yet exist
- Valid transitions: `pending → processing → success`, `pending → processing → failed`; invalid transitions return `409 Conflict` (see the sketch after this list)
- Readiness for a run available at `GET /areas-of-interest/{id}/runs/{run_id}/readiness` → `{ready: bool, details: {pipeline_name: "success"|"failed"|"pending"|"processing"}}`
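The transition rules are small enough to encode as a lookup table. A sketch of the check a callback handler might run before its upsert; names are illustrative:

```go
package aoi

// validNext encodes the allowed transitions from the spec:
// pending -> processing -> success | failed. Terminal states allow nothing.
var validNext = map[string]map[string]bool{
	"pending":    {"processing": true},
	"processing": {"success": true, "failed": true},
	"success":    {},
	"failed":     {},
}

// CheckTransition classifies a callback. A handler would treat noop as a
// 200 with no write (idempotent re-send) and !allowed as 409 Conflict.
// A first callback with no existing record starts from "pending", matching
// the rule that the service creates the record on first contact.
func CheckTransition(current, next string) (noop, allowed bool) {
	if current == next {
		return true, true // idempotent re-send of the same status
	}
	return false, validNext[current][next]
}
```

For example, `CheckTransition("processing", "success")` returns `(false, true)`, while `CheckTransition("success", "processing")` returns `(false, false)` and would surface as `409 Conflict`.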
Non-Goals:
- No overlap/conflict resolution between AOIs — pipeline handles union
- No customer-facing access
- No RBAC beyond Okta authentication (v1)
- No geometry history / versioning (v1)
- Gold table writers out of scope — obstacle/OSM pipelines only
Example UI (screenshot omitted)
3. Regulatory & Compliance
No direct FAA or DO-178C applicability — internal data management tool, not flight-critical.
- Soft-delete preserves audit trail for downstream regulated systems
- Okta SSO required; no anonymous access
- OSM data is public; no ITAR/PII concerns
- All data stored in existing GCP project; no new data residency concerns
4. Technical Specifications
Architecture:
```mermaid
flowchart LR
UI["Admin UI\n(React)"]
Okta["Okta OIDC"]
SVC["AOI Service\n(Go)"]
DB["PostgreSQL\n+ PostGIS"]
AF["Airflow\n(Atlas Pipeline)"]
UI -->|"OIDC auth"| Okta
UI -->|"CRUD + refresh"| SVC
SVC -->|"persist AOIs + runs"| DB
SVC -->|"trigger DAG run\nconf: {aoi_id, run_id}"| AF
AF -->|"query active AOIs"| SVC
AF -->|"PATCH runs/{run_id}/pipelines/{name}"| SVC
```
Trigger & callback flow:
```mermaid
sequenceDiagram
actor Operator
participant UI as Admin UI
participant SVC as AOI Service
participant DB as PostgreSQL
participant AF as Airflow
Operator->>UI: Create AOI / Request Refresh
UI->>SVC: POST /areas-of-interest
SVC->>DB: persist AOI + generate run_id + create aoi_runs record
SVC->>AF: POST /api/v1/dags/{dag}/dagRuns\nconf: {aoi_id, run_id}
AF-->>SVC: 200 OK
SVC-->>UI: 201 Created {aoi, run_id}
loop per pipeline
AF->>SVC: PATCH /{id}/runs/{run_id}/pipelines/{name}\n{status: processing|success|failed}
SVC->>DB: upsert aoi_run_pipelines
SVC-->>AF: 200 OK
end
Operator->>UI: Poll readiness
UI->>SVC: GET /{id}/runs/{run_id}/readiness
SVC->>DB: aggregate pipeline statuses
SVC-->>UI: {ready: true, details: {...}}
```
Affected Repos:
| Repo | Language | Change |
|---|---|---|
| `droneup/dataanalytics-atlas-pipeline` | Python/Airflow | Add AOI API queries to download tasks + per-pipeline status callbacks |
| `droneup/dataanalytics-atlas-aoi-service` (new) | Go | New microservice |
| `droneup/dataanalytics-atlas-aoi-ui` (new) | React/TypeScript | New admin UI |
Data Model:
```sql
CREATE TABLE areas_of_interest (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
geometry GEOMETRY(GEOMETRY, 4326) NOT NULL, -- PostGIS WGS84
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ
);
CREATE INDEX ON areas_of_interest USING GIST (geometry);
CREATE INDEX ON areas_of_interest (deleted_at) WHERE deleted_at IS NULL;
CREATE TABLE aoi_runs (
id UUID PRIMARY KEY, -- generated by AOI service, passed to Airflow as run_id
aoi_id UUID NOT NULL REFERENCES areas_of_interest(id),
triggered_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON aoi_runs (aoi_id, triggered_at DESC);
CREATE TABLE aoi_run_pipelines (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_id UUID NOT NULL REFERENCES aoi_runs(id),
pipeline_name TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'processing', 'success', 'failed')),
error TEXT,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (run_id, pipeline_name)
);
```

Derived statuses are computed in the service layer (not stored):
- Run readiness: all pipelines `success` → `ready: true`; otherwise `ready: false`
- AOI status (shown on list/detail): derived from the latest `aoi_runs` record for that AOI
API:
```
POST /areas-of-interest → 201 includes run_id [Okta]
GET /areas-of-interest → 200 [Okta + service account]
GET /areas-of-interest/{id} → 200 includes latest run_id + derived status [Okta]
PATCH /areas-of-interest/{id} → 200 [Okta]
DELETE /areas-of-interest/{id} → 204 soft delete [Okta]
POST /areas-of-interest/{id}/refresh → 202 includes new run_id [Okta]
PATCH /areas-of-interest/{id}/runs/{run_id}/pipelines/{name} → 200 idempotent [service account]
GET /areas-of-interest/{id}/runs/{run_id}/readiness → 200 {ready, details} [Okta + service account]
```

Trigger flow (shared by create and refresh):
- Generate `run_id` UUID; persist `aoi_runs` record
- Call Airflow REST API: `POST /api/v1/dags/{osm_dag_id}/dagRuns` with `conf: {aoi_id, run_id}`
- On Airflow failure: record is retained with no pipeline entries; `run_id` returned so client can poll or retry
- Return `run_id` in response body (sketched below, after the refresh flow)
Create flow (POST /areas-of-interest):
- Validate and persist AOI record
- Execute trigger flow above
- Return `201 Created` with AOI + `run_id`
Refresh flow (POST /{id}/refresh):
- Return `409` if latest run for this AOI is `processing`
- Execute trigger flow above
- Return `202 Accepted` with new `run_id` (trigger flow sketched below)
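A minimal sketch of the shared trigger flow, assuming `github.com/google/uuid` and plain `net/http`; the Airflow base URL, credential handling, and error surfacing are deployment details, and the refresh flow's `409` guard would run before this is called:

```go
package aoi

import (
	"bytes"
	"context"
	"database/sql"
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/google/uuid"
)

// TriggerRun mints a run_id, persists the aoi_runs row, then asks Airflow
// to start the DAG with {aoi_id, run_id} in conf. On Airflow failure the
// aoi_runs row is retained with no pipeline entries and run_id is still
// returned, so the client can poll or retry, per the spec.
func TriggerRun(ctx context.Context, db *sql.DB, client *http.Client,
	airflowBaseURL, dagID, token, aoiID string) (string, error) {

	runID := uuid.NewString()
	if _, err := db.ExecContext(ctx,
		`INSERT INTO aoi_runs (id, aoi_id) VALUES ($1, $2)`,
		runID, aoiID); err != nil {
		return "", err
	}

	payload, err := json.Marshal(map[string]any{
		"conf": map[string]string{"aoi_id": aoiID, "run_id": runID},
	})
	if err != nil {
		return runID, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		fmt.Sprintf("%s/api/v1/dags/%s/dagRuns", airflowBaseURL, dagID),
		bytes.NewReader(payload))
	if err != nil {
		return runID, err
	}
	req.Header.Set("Content-Type", "application/json")
	// Auth toward Airflow depends on the deployment; a bearer token is
	// assumed here purely for illustration.
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req)
	if err != nil {
		return runID, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return runID, fmt.Errorf("airflow trigger failed: %s", resp.Status)
	}
	return runID, nil
}
```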
Auth & Validation:
- Admin UI: Okta OIDC (authorization code flow)
- Airflow → AOI service callbacks: bearer token stored in GCP Secret Manager, rotated on schedule
- GeoJSON validated server-side (`ST_IsValid`) before PostGIS insert; reject with `422` on invalid geometry (sketched below)
- No PII stored
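Delegating geometry checks to PostGIS avoids carrying a second GeoJSON parser in Go. A sketch, assuming an `ST_GeomFromGeoJSON`/`ST_IsValid` round trip where both unparsable payloads and invalid geometries map to the `422`; the helper name and error value are illustrative:

```go
package aoi

import (
	"context"
	"database/sql"
	"errors"
)

// ErrInvalidGeometry is what a handler would translate into a 422 response.
var ErrInvalidGeometry = errors.New("invalid GeoJSON geometry")

// ValidateGeoJSON parses the payload with ST_GeomFromGeoJSON and checks it
// with ST_IsValid (self-intersection, ring orientation, etc.).
func ValidateGeoJSON(ctx context.Context, db *sql.DB, geojson string) error {
	var valid bool
	if err := db.QueryRowContext(ctx,
		`SELECT ST_IsValid(ST_GeomFromGeoJSON($1))`, geojson).Scan(&valid); err != nil {
		// ST_GeomFromGeoJSON raises on malformed input; a real handler
		// should still distinguish that from transport errors before
		// answering 422.
		return ErrInvalidGeometry
	}
	if !valid {
		return ErrInvalidGeometry
	}
	return nil
}
```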
5. Risks & Phased Rollout
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Airflow REST API not reachable from AOI service in-cluster | Medium | High | Verify network policy with platform team before design finalization |
| Geometry update mid-pipeline run resets in-flight records | Low | Medium | In-flight Airflow runs use a snapshot of AOI; reset only affects next pickup |
Phased Rollout:
- MVP: CRUD API + pipeline status callbacks + pipeline integration in `dataanalytics-atlas-pipeline`. Admin UI — map draw, GeoJSON upload, status display, refresh button. Okta SSO.
- v2.0: RBAC, per-source resolution tiers, AOI templates.
Dependencies:
- Okta application registration for `dataanalytics-atlas-aoi-ui`
- Airflow REST API reachable from K8s cluster (verify with platform team)
- New repo provisioning: `dataanalytics-atlas-aoi-service`, `dataanalytics-atlas-aoi-ui`
6. Estimation Input
```yaml
prd_sizing_input:
feature: "AOI Management Service — demand-driven data acquisition"
scope_clarity: "A"
key_terms:
- "area of interest"
- "geojson"
- "osm obstacle pipeline"
- "pipeline status"
- "data refresh"
risk_flags:
- "new-service"
- "geospatial"
- "okta-integration"
- "airflow-integration"
affected_repos:
- "dataanalytics-atlas-pipeline"
- "dataanalytics-atlas-aoi-service"
- "dataanalytics-atlas-aoi-ui"
domains:
- "backend-go"
- "frontend-react"
- "data-pipeline-python"
- "infrastructure-gcp"
regulatory: false
  discovery_needed: false
```