JWKS Caching
Originally ADR-0014-JWKS_caching (v6)
Context
Constant calls to Frontegg to request the JWKS-formatted public key are slowing down the deliveries flow and breaking our SLA with Walmart: the SLA is under 1 second, but with all the calls to Frontegg we are currently averaging 5 seconds. Each service in the chain has to make two calls to Frontegg: first to the configuration endpoint to get the JWKS endpoint URL, then to the JWKS endpoint itself. Services currently cache the public key locally, but only for 1 minute, and only after the fact: the cache is populated when a request needs the key, so the first request after each expiry still pays for both calls.
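To make the cost concrete, the two-call lookup described above can be sketched roughly as follows. This is an illustration, not the code in our shared libraries: the names `fetchJWKS`, `getFunc`, and `httpGet` are hypothetical, and it assumes the configuration document exposes the JWKS URL in a `jwks_uri` field (the field name used by OpenID Connect discovery documents). Whether Frontegg's configuration endpoint has that shape is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"time"
)

// getFunc performs one GET and returns the response body.
// It is injected so the lookup logic can be exercised without a network.
type getFunc func(url string) ([]byte, error)

// fetchJWKS performs the two sequential calls: configuration first, then JWKS.
// Assumption: the configuration document carries the JWKS URL in "jwks_uri".
func fetchJWKS(get getFunc, configURL string) ([]byte, error) {
	// Call 1: configuration endpoint, to discover the JWKS endpoint URL.
	raw, err := get(configURL)
	if err != nil {
		return nil, fmt.Errorf("fetch configuration: %w", err)
	}
	var cfg struct {
		JWKSURI string `json:"jwks_uri"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse configuration: %w", err)
	}
	if cfg.JWKSURI == "" {
		return nil, errors.New("configuration has no jwks_uri")
	}
	// Call 2: the JWKS endpoint itself.
	jwks, err := get(cfg.JWKSURI)
	if err != nil {
		return nil, fmt.Errorf("fetch JWKS: %w", err)
	}
	return jwks, nil
}

// httpGet is one possible getFunc backed by net/http with a fixed timeout.
func httpGet(url string) ([]byte, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}
```

Both round trips sit on the request path whenever the local cache is empty or expired, and every service in the chain repeats them.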
Decision
Implement an asynchronous cache warmer inside the shared libraries so that each client can manage its own cache efficiently. The cache warmer will be responsible for asynchronously refreshing the local JWKS cache on a timer, so that validation itself can happen with zero external calls.
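A minimal sketch of what such a warmer could look like in Go is shown below. It is not the design of common-middleware-golang: the names `Warmer`, `FetchFunc`, `NewWarmer`, and `ErrEmptyJWKS` are hypothetical, and keeping the previous keys when a refresh fails is an assumption rather than something this ADR specifies.

```go
package main

import (
	"context"
	"errors"
	"log"
	"sync"
	"time"
)

// ErrEmptyJWKS is returned when a fetch succeeds but yields no document.
var ErrEmptyJWKS = errors.New("jwks warmer: fetch returned an empty document")

// FetchFunc retrieves the raw JWKS document from the identity provider.
// In a real service it would wrap the configuration and JWKS calls.
type FetchFunc func() ([]byte, error)

// Warmer holds the latest JWKS in memory and refreshes it in the background.
type Warmer struct {
	fetch FetchFunc
	mu    sync.RWMutex
	jwks  []byte
}

// NewWarmer returns a Warmer with an empty cache.
func NewWarmer(fetch FetchFunc) *Warmer {
	return &Warmer{fetch: fetch}
}

// Refresh fetches once and swaps the cached value.
// Assumption: on any failure the previously cached keys are kept.
func (w *Warmer) Refresh() error {
	jwks, err := w.fetch()
	if err != nil {
		return err
	}
	if len(jwks) == 0 {
		return ErrEmptyJWKS
	}
	w.mu.Lock()
	w.jwks = jwks
	w.mu.Unlock()
	return nil
}

// Get returns the cached JWKS without making any external call.
// The second result is false until the first successful refresh.
// Callers must treat the returned slice as read-only.
func (w *Warmer) Get() ([]byte, bool) {
	w.mu.RLock()
	defer w.mu.RUnlock()
	return w.jwks, w.jwks != nil
}

// Run refreshes immediately, then again on every tick, until ctx is cancelled.
// interval must be positive, because time.NewTicker panics otherwise.
func (w *Warmer) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if err := w.Refresh(); err != nil {
			log.Printf("jwks warmer: refresh failed, keeping previous keys: %v", err)
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}
```

A service would start the warmer once at startup, for example with `go w.Run(ctx, 5*time.Minute)` (the interval here is arbitrary), and have its validation middleware read keys through `w.Get()` only. The network cost then moves off the request path entirely, including after long idle periods such as overnight.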
Alternatives considered
Increase local service caching to one hour
The main alternative would be to set each service’s JWKS cache to one hour instead of the current value of one minute. The drawback is that we would still pay the lookup toll whenever the cache expires. If there has been a significant amount of time since the last delivery (e.g., overnight), every service in the chain would have to pay the lookup toll, since their local caches would all have expired.
Create a centralized caching service
There is high overhead in writing and maintaining a separate service, and it breaks our mantra of “run less software”.
Consequences
common-middleware-golang will need its caching provider replaced with an async cache warmer.
A Node.js solution will need to be created.