GKE Version Alerting
ADR-0065 · Author: Sybil Melton · Date: 2025-02-07 · Products: platform
Originally ADR-0008-GKE-VERSION-ALERTING (v6)
Context and problem statement
Platform Engineering must create a solution for alerting on GKE versions so that it can ensure the availability of Platform services for customers.
Decision drivers
Actionable alerts
- Easily created
- Easily understood
- Easily delivered
- Actively work to minimize, reduce, or eliminate Alert Fatigue
Security
Accessible and easy-to-use GCP-native services
Maintainability
Extensibility
Decision Outcome
Platform Engineering will use a combination of Cloud Functions (Gen2), Pub/Sub, and Eventarc to send a message to Slack whenever GKE emits an UpgradeAvailableEvent. Platform Engineering shall be responsible for GKE maintenance and availability.
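
As a rough illustration of the function's core logic, the sketch below turns a Pub/Sub message carrying a GKE cluster notification into a Slack webhook payload. It is a minimal sketch, not the implementation chosen by this ADR. It assumes the notification carries `type_url`, `cluster_name`, `cluster_location`, `project_id`, and a JSON `payload` as message attributes, as described in the GKE cluster notifications documentation. The function names `build_slack_message` and `post_to_slack`, the message wording, and the use of a Slack incoming webhook are all hypothetical choices made for the example.

```python
import json
import urllib.request
from typing import Optional

# Assumption: GKE identifies the notification type via the "type_url" attribute,
# e.g. "type.googleapis.com/google.container.v1beta1.UpgradeAvailableEvent".
UPGRADE_AVAILABLE_SUFFIX = "UpgradeAvailableEvent"


def build_slack_message(pubsub_message: dict) -> Optional[dict]:
    """Return a Slack webhook payload, or None if the event should be ignored."""
    attrs = pubsub_message.get("attributes", {})

    # Alert only on UpgradeAvailableEvent; dropping everything else keeps noise down.
    if not attrs.get("type_url", "").endswith(UPGRADE_AVAILABLE_SUFFIX):
        return None

    # Assumption: the event body is a JSON string in the "payload" attribute.
    payload = json.loads(attrs.get("payload", "{}"))
    version = payload.get("version", "unknown")
    resource = payload.get("resourceType", "unknown resource")

    cluster = attrs.get("cluster_name", "unknown-cluster")
    location = attrs.get("cluster_location", "unknown-location")
    project = attrs.get("project_id", "unknown-project")

    text = (
        f"GKE upgrade available for {resource} of cluster {cluster} "
        f"({location}, project {project}): version {version}"
    )
    return {"text": text}


def post_to_slack(webhook_url: str, message: dict) -> int:
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

In a deployed Gen2 function, an Eventarc-triggered entry point would extract the Pub/Sub message from the incoming event, call `build_slack_message`, and call `post_to_slack` only when the result is not `None`. Keeping the formatting logic free of network calls makes it straightforward to unit test.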

Consequences
- No paved road exists for Google Cloud Functions (Gen2), so adopting it requires some research and effort.
- Alerting is step one of this journey. Keeping alerts actionable and reducing Alert Fatigue are both critically important.
- This approach can be extended to provide visibility into active UpgradeEvents and Security Patch Bulletins for GKE.
- This approach creates a path for extending event-driven architecture to Platform products in the future, with a significant focus on the relationship between Cloud Functions, Pub/Sub, and Eventarc.
- This approach may lead to a paved road for an opinionated Google Cloud Function (Gen2).
- Reacting to alerts may destabilize sprint priorities for Platform Engineering in favor of ensuring Platform availability for customers.
- Requires updating OPA to allow the new resources google_cloudfunctions2_function_iam_member and google_cloudfunctions2_function.
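
The extension to UpgradeEvents and security bulletins noted in the consequences above could be handled with a small dispatch table keyed on the notification type. The sketch below is illustrative only. It assumes the same attribute layout as the GKE cluster notifications documentation (`type_url`, `cluster_name`, JSON `payload`) and assumes the payload field names `version`, `currentVersion`, `targetVersion`, `bulletinId`, and `severity`. The names `FORMATTERS` and `format_notification` and the message wording are hypothetical.

```python
import json
from typing import Callable, Dict, Optional


def _fmt_upgrade_available(cluster: str, payload: dict) -> str:
    return f"Upgrade available for {cluster}: {payload.get('version', 'unknown')}"


def _fmt_upgrade(cluster: str, payload: dict) -> str:
    current = payload.get("currentVersion", "unknown")
    target = payload.get("targetVersion", "unknown")
    return f"Upgrade in progress on {cluster}: {current} -> {target}"


def _fmt_security_bulletin(cluster: str, payload: dict) -> str:
    bulletin = payload.get("bulletinId", "unknown")
    severity = payload.get("severity", "unknown")
    return f"Security bulletin {bulletin} ({severity}) affects {cluster}"


# Maps the last segment of type_url to a formatter; supporting a new event
# type means adding one entry here.
FORMATTERS: Dict[str, Callable[[str, dict], str]] = {
    "UpgradeAvailableEvent": _fmt_upgrade_available,
    "UpgradeEvent": _fmt_upgrade,
    "SecurityBulletinEvent": _fmt_security_bulletin,
}


def format_notification(attributes: dict) -> Optional[str]:
    """Return alert text for a known event type, or None for anything else."""
    # "type.googleapis.com/google.container.v1beta1.UpgradeEvent" -> "UpgradeEvent"
    event_type = attributes.get("type_url", "").rsplit(".", 1)[-1]
    formatter = FORMATTERS.get(event_type)
    if formatter is None:
        return None
    payload = json.loads(attributes.get("payload", "{}"))
    return formatter(attributes.get("cluster_name", "unknown-cluster"), payload)
```

Returning `None` for unrecognized types means new GKE event types are ignored until a formatter is added deliberately, which is consistent with the goal of limiting Alert Fatigue.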