Authenticating Consumers
Introduction
Envoy AI Gateway authenticates every inference request at the edge and propagates the caller's identity to downstream policies. Authentication is configured with the Envoy Gateway SecurityPolicy resource, which attaches to the HTTPRoute generated by an AIGatewayRoute. After the caller is identified, selected claims are copied into request headers that token quotas and usage metering consume as the per-tenant key.
This turns a per-consumer credential, such as an SSO (Single Sign-On) token or an API key, into an identity that quota and metering can act on. It is the foundation of multi-tenant model serving on a shared gateway.
Use Cases
- A developer obtains a JWT (JSON Web Token) from the platform identity provider and calls the gateway with it, so the gateway enforces per-user token quotas.
- A CI job presents a service-account token so that automated traffic is attributed to a team rather than an individual.
- A machine consumer that cannot run an interactive login presents a static API key that maps to a known tenant.
Prerequisites
- Envoy AI Gateway is installed. See Install Envoy AI Gateway.
- An AIGatewayRoute already routes requests to one or more backends.
- For the OIDC/JWT path: an OIDC issuer with a reachable JWKS endpoint. The platform's built-in identity provider, Dex, is the default; any other OIDC issuer (Keycloak, Auth0, Okta, GitHub OIDC, an enterprise Entra ID tenant) also works as long as the gateway can reach its /.well-known/openid-configuration and JWKS URL.
- For the API-key path: cluster permission to create Secret objects in the gateway's namespace.
Create the Gateway and AIGatewayRoute in a dedicated namespace (for example maas-system), not in the Envoy Gateway control-plane namespace envoy-gateway-system. A gateway placed in the control-plane namespace may not have the AI Gateway request-processing filter and SecurityPolicy applied to its listener, which silently breaks routing and policy enforcement. See Envoy AI Gateway.
Steps
Authenticate with OIDC or JWT
Validate tokens issued by an OIDC issuer. The platform's built-in Dex is the default issuer; it can also broker external identity sources, such as LDAP or another OIDC provider, so their users obtain platform tokens. Those connectors are configured in platform IdP (Identity Provider) management. For platform IdP configuration, see Identity Providers.
Any OIDC issuer with a reachable JWKS endpoint can be used. Replace the issuer and remoteJWKS.uri below with the issuer of your choice when consumers are not platform users — for example, an enterprise Keycloak realm or a SaaS IdP — so the gateway accepts their tokens without requiring a platform account.
Point the gateway at the OIDC issuer and map its claims to identity headers:
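A minimal sketch of such a policy follows. The resource name, the provider name, the https scheme, the <gateway-client-id> placeholder, and the choice of sub and department as source claims are assumptions for illustration; substitute the claims your issuer actually emits.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-jwt-auth            # hypothetical policy name
  namespace: maas-system         # example dedicated namespace used on this page
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>
  jwt:
    providers:
      - name: platform-dex       # hypothetical provider name
        issuer: https://<platform-address>/dex
        audiences:
          - <gateway-client-id>  # hypothetical placeholder for the gateway's client ID
        remoteJWKS:
          uri: https://<platform-address>/dex/keys
        claimToHeaders:
          - claim: sub           # assumption: the subject identifies the user
            header: x-user-id
          - claim: department    # assumption: a single-valued claim added in the IdP connector
            header: x-user-group
          - claim: namespace     # non-standard claim; map only when billing by namespace
            header: x-user-namespace
```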
- <platform-address>: the platform access address. Dex publishes its issuer at /dex and its JWKS at /dex/keys.
- <aigatewayroute-name>: the name of the HTTPRoute generated by your AIGatewayRoute.
- audiences: the token audience(s) the gateway accepts. On a shared IdP, omitting this accepts any valid token from the same issuer, including tokens minted for other clients, which would still resolve to an x-user-id and consume quota. Set it to the client ID(s) the gateway's tokens are issued for; if your issuer does not set a distinguishing aud, register a dedicated client for the gateway.
- claimToHeaders: the bridge between identity and policy. The emitted headers (x-user-id, x-user-group, x-user-namespace) become the selector keys for token quotas and the label values for usage metering and chargeback.
claimToHeaders only supports scalar claims (string, int, double, bool); array-typed claims are not supported and will not populate the header. The standard OIDC groups claim is usually an array — to use it as x-user-group/department, expose a single-valued claim from the IdP connector (for example a primary-group or a dedicated department claim) and map that. If x-user-group stays empty, per-department metering, quotas, and tiers silently fall back to no grouping.
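To make the distinction concrete, here is a hypothetical decoded token payload, written as YAML for readability. Every value and the department claim name are invented for illustration; your issuer's claims will differ.

```yaml
# Hypothetical decoded token claims (illustrative values only)
sub: u-1234                     # scalar string: can be mapped to x-user-id
groups:                         # array: not supported by claimToHeaders, header stays empty
  - ml-platform
  - sre
department: ml-platform         # scalar string: can be mapped to x-user-group
```

In this sketch, mapping groups would leave x-user-group unset, while mapping department would populate it.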
namespace is not a standard OIDC claim: it must be added in the upstream IdP connector and is absent by default. x-user-namespace is the per-namespace chargeback key consumed by Metering Token Usage; map it only when billing by namespace or tenant. To key policies on any other attribute the platform does not emit by default, such as a subscription tier, add the claim in the connector and map it with an extra claimToHeaders entry.
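As an illustration, an extra entry for a subscription tier might look like the fragment below. The claim name tier and the header name x-user-tier are hypothetical; use whatever claim the connector is configured to add.

```yaml
# Fragment of a JWT provider in the SecurityPolicy; existing entries stay as they are.
claimToHeaders:
  - claim: tier          # hypothetical custom claim added in the IdP connector
    header: x-user-tier  # hypothetical header for downstream policies to key on
```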
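To roll out without blocking traffic, first set jwt.optional: true and observe traffic. In this mode, requests without a token pass through unauthenticated, while requests carrying an invalid token are still rejected. Remove the setting once all consumers present valid tokens.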
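A sketch of where the setting sits in the policy, shown as a fragment; the providers list is assumed to stay as already configured.

```yaml
# Fragment of the SecurityPolicy spec during rollout.
spec:
  jwt:
    optional: true   # temporary: a missing token is tolerated; delete this line to enforce
```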
Authenticate with an API key
If the Gateway is created in the Envoy Gateway control-plane namespace envoy-gateway-system, apiKeyAuth (like model routing) is silently not enforced: the SecurityPolicy reports Accepted=True, but a wrong or missing key still returns 200 and no x-user-id is injected. This is the same control-plane listener-skip issue described in the prerequisites note above — not an Envoy Gateway version bug. The fix is to create the Gateway and AIGatewayRoute in a dedicated namespace (for example maas-system), where apiKeyAuth enforces natively with no patch. Verify with a single no-key request: a dedicated-namespace gateway returns 401. If the gateway must stay in envoy-gateway-system, see the supported remedies at the end of this section.
For machine consumers that cannot perform an OIDC flow, validate a static API key instead. There is no issuance service: the cluster administrator generates a random string per consumer, stores it in a Secret, and shares it out of band. The gateway's data plane validates each request by looking the presented value up in that Secret.
Generate one key per consumer and store them in a single Opaque Secret. Each data-map key is the client identifier that downstream policies see; each value is the API key the consumer presents:
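A sketch of such a Secret is shown below. The Secret name, the client identifiers, and the placeholder values are hypothetical; replace each placeholder with a randomly generated string.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: maas-api-keys        # hypothetical Secret name
  namespace: maas-system     # the gateway's namespace
type: Opaque
stringData:
  # <client identifier>: <API key the consumer presents>
  team-a-ci: <random-key-for-team-a-ci>        # hypothetical consumer
  batch-scorer: <random-key-for-batch-scorer>  # hypothetical consumer
```

Using stringData lets the API server handle the base64 encoding of each value.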
Bind the Secret to the route with a SecurityPolicy:
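A minimal sketch, assuming a Secret named maas-api-keys in the same namespace and a hypothetical policy name:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-apikey-auth       # hypothetical policy name
  namespace: maas-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>
  apiKeyAuth:
    credentialRefs:
      - group: ""
        kind: Secret
        name: maas-api-keys    # assumed name of the Secret holding the keys
    extractFrom:
      - headers:
          - X-API-Key          # dedicated header, compared as a literal string
    forwardClientIDHeader: x-user-id   # same header name as the OIDC path
    sanitize: true             # strip the raw key before forwarding upstream
```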
- credentialRefs: one or more Opaque Secrets holding the credentials. Each data-map key is the client identifier, and each value is the literal API key. Adding a consumer is a kubectl patch of one entry; revoking is a single key deletion.
- extractFrom: where Envoy reads the presented key from. The filter does a literal-string compare, so prefer a dedicated header such as X-API-Key. Reusing Authorization requires storing the value with its Bearer prefix, which mixes badly with the OIDC path on the same gateway.
- forwardClientIDHeader: the header that carries the matched client identifier to the upstream and to later filters. Use the same name as the OIDC claimToHeaders target (x-user-id) so token quotas and usage metering see one consistent key across both auth paths.
- sanitize: prevents the raw API key from leaking to the model backend or being logged downstream.
If the gateway must stay in envoy-gateway-system, apiKeyAuth enforcement is defeated by the same control-plane listener-skip described above, and no per-route SecurityPolicy change fixes it. The supported remedies are to move the Gateway and AIGatewayRoute to a dedicated namespace (preferred), or to upgrade Envoy AI Gateway to a release that narrows the listener-skip (v0.6.0 / Alauda release-0.6.0-alauda). A hand-rolled EnvoyPatchPolicy that edits the listener filter chain may serve as a temporary stopgap, but it is version-specific and fragile — it depends on the exact filter layout of the running Envoy build — and is not recommended for production.
Verification
Confirm the policy is accepted. SecurityPolicy status is ancestor-scoped, so the jsonpath looks one level deeper than for most resources:
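The shape of the status explains the extra level. The fragment below is a sketch of what an accepted policy's status is expected to look like; the Gateway name is a placeholder and the exact set of fields may vary by Envoy Gateway version. Against this shape, a query such as kubectl get securitypolicy <policy-name> -n maas-system -o jsonpath='{.status.ancestors[0].conditions[?(@.type=="Accepted")].status}' selects the condition under the first ancestor.

```yaml
# Sketch of SecurityPolicy status; conditions sit under each ancestor, not under status directly.
status:
  ancestors:
    - ancestorRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: <gateway-name>     # hypothetical placeholder
        namespace: maas-system
      conditions:
        - type: Accepted
          status: "True"
```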
The command returns True when the policy is programmed.
For the OIDC path, send a request with a valid token and confirm that the upstream service receives each header mapped in claimToHeaders, for example x-user-id and x-user-group.
For the API-key path, send a request with the matching X-API-Key header and confirm that the upstream sees x-user-id set to the matched client identifier.
A wrong or missing key returns 401 Unauthorized from the gateway before the request reaches any backend.
Learn More
Next Steps
After identity headers are propagated, configure Configuring Token Quotas to enforce per-tenant token budgets.