GKE Version Alerting

Andi Lamprecht · Accepted
ADR-0065 · Author: Sybil Melton · Date: 2025-02-07 · Products: platform
Originally ADR-0008-GKE-VERSION-ALERTING (v6)

Context and problem statement

Platform Engineering needs a solution that alerts when new GKE versions become available, so that it can ensure the availability of Platform services for customers.

Decision drivers

  • Actionable alerts
    • Easily created
    • Easily understood
    • Easily delivered
    • Designed to minimize or eliminate alert fatigue
  • Security
  • Accessible and easy-to-use GCP-native services
  • Maintainability
  • Extensibility

Decision Outcome

Platform Engineering will use a combination of Cloud Functions (Gen2), Pub/Sub, and Eventarc to send a message to Slack whenever GKE emits an UpgradeAvailableEvent. Platform Engineering shall be responsible for GKE maintenance and availability.

[Figure: GKE version alerting architecture (GKE_VERSION_ALERTING.png)]
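The core of the function described above can be sketched as follows. This is a minimal illustration, not the team's implementation. It assumes GKE cluster notifications arrive as Pub/Sub message attributes named `type_url`, `payload`, `cluster_name`, `cluster_location`, and `project_id`, and that the payload is a JSON string with `version`, `resourceType`, and `releaseChannel` fields. These names should be checked against the GKE cluster notifications documentation. The helper names `build_slack_message` and `post_to_slack` and the webhook URL are hypothetical, and the Cloud Function entry point that unwraps the Eventarc event is left out.

```python
import json
import urllib.request

# Assumed type URL for GKE "upgrade available" notifications; verify against
# the GKE cluster notifications documentation before relying on it.
UPGRADE_AVAILABLE_TYPE = (
    "type.googleapis.com/google.container.v1beta1.UpgradeAvailableEvent"
)


def build_slack_message(attributes):
    """Return a Slack webhook payload for an UpgradeAvailableEvent, else None."""
    if attributes.get("type_url") != UPGRADE_AVAILABLE_TYPE:
        return None  # Ignore other notification types to limit alert noise.

    # The event body is assumed to arrive as a JSON string in "payload".
    payload = json.loads(attributes.get("payload", "{}"))
    version = payload.get("version", "unknown")
    resource_type = payload.get("resourceType", "unknown")
    channel = payload.get("releaseChannel", {}).get("channel", "none")
    cluster = attributes.get("cluster_name", "unknown")
    location = attributes.get("cluster_location", "unknown")
    project = attributes.get("project_id", "unknown")

    text = (
        f"GKE upgrade available for cluster {cluster} "
        f"({location}, project {project}): {resource_type} version {version} "
        f"on release channel {channel}."
    )
    return {"text": text}


def post_to_slack(webhook_url, message):
    """POST the message to a Slack incoming webhook (URL supplied by caller)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping message construction separate from delivery lets the formatting logic be unit tested without network access. Returning `None` for unrelated notification types is one way to keep the Slack channel limited to actionable alerts.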

Consequences

  • No paved road exists for Google Cloud Functions (Gen2), so adopting them requires some research and effort.
  • Alerting is step one of this journey. Keeping alerts actionable and reducing alert fatigue are both critically important.
  • This approach can be extended to provide visibility into active UpgradeEvents and Security Patch Bulletins for GKE.
  • This approach opens a path to extending event-driven architecture across Platform products, with a significant focus on how Cloud Functions, Pub/Sub, and Eventarc work together.
  • This approach could lead to a paved road for an opinionated Google Cloud Function (Gen2).
  • Reacting to alerts may destabilize sprint priorities for Platform Engineering in favor of ensuring Platform availability for customers.
  • OPA policy must be updated to allow the new Terraform resources google_cloudfunctions2_function and google_cloudfunctions2_function_iam_member.
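The extension to UpgradeEvents and security bulletins mentioned in the consequences could take the form of a small dispatch table keyed on notification type. The sketch below is illustrative only. The type URLs and the payload field names (`version`, `currentVersion`, `targetVersion`, `bulletinId`, `severity`) are assumptions about the GKE notification format that need to be verified, and `summarize` and the formatter functions are hypothetical names.

```python
import json

# Assumed common prefix of GKE notification type URLs.
TYPE_PREFIX = "type.googleapis.com/google.container.v1beta1."


def format_upgrade_available(payload):
    return f"upgrade available: {payload.get('version', 'unknown')}"


def format_upgrade(payload):
    current = payload.get("currentVersion", "unknown")
    target = payload.get("targetVersion", "unknown")
    return f"upgrade: {current} -> {target}"


def format_security_bulletin(payload):
    bulletin = payload.get("bulletinId", "unknown")
    severity = payload.get("severity", "unknown")
    return f"security bulletin {bulletin} (severity {severity})"


# Supporting a new notification type means adding one entry here.
FORMATTERS = {
    TYPE_PREFIX + "UpgradeAvailableEvent": format_upgrade_available,
    TYPE_PREFIX + "UpgradeEvent": format_upgrade,
    TYPE_PREFIX + "SecurityBulletinEvent": format_security_bulletin,
}


def summarize(attributes):
    """Return a one-line summary for a known notification type, else None."""
    formatter = FORMATTERS.get(attributes.get("type_url"))
    if formatter is None:
        return None  # Unknown types are dropped rather than alerted on.
    payload = json.loads(attributes.get("payload", "{}"))
    cluster = attributes.get("cluster_name", "unknown")
    return f"[{cluster}] {formatter(payload)}"
```

With this shape, the Pub/Sub and Eventarc wiring stays unchanged as new event types are added. Only the formatter table grows.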
