
CORS-4470: add GCP Workload Identity Federation support #10610

Draft
rochacbruno wants to merge 5 commits into openshift:main from rochacbruno:feat/CORS-4470


@rochacbruno rochacbruno commented Jun 8, 2026

Member

Summary

  • Add workloadIdentityFederation field to the GCP platform install-config with optional poolID and providerID fields
  • Support two modes: BYO (customer provides existing WIF pool/provider) and installer-provisioned (installer creates WIF pool, OIDC GCS bucket, OIDC provider, and SA bindings)
  • Generate external_account credential manifests for STS token exchange instead of static service account keys
  • Auto-default CredentialsMode to Manual when WIF is configured
  • Add WIF resource teardown in the destroy flow (skips BYO resources)

Fixes: https://issues.redhat.com/browse/CORS-4470

Generated with Claude Code
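
For readers unfamiliar with the external_account format mentioned in the summary, here is a minimal sketch of what such a credential configuration might look like when built in Go. The helper names (`audienceURI`, `buildExternalAccount`), the example IDs, and the token file path are assumptions for illustration and are not taken from this PR; the field names follow the external_account layout used by Google's auth libraries.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// audienceURI builds the provider resource name used as the STS audience.
// It takes the numeric project number, not the project ID.
func audienceURI(projectNumber, poolID, providerID string) string {
	return fmt.Sprintf(
		"//iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s/providers/%s",
		projectNumber, poolID, providerID)
}

// credentialSource points the auth library at a file holding the
// projected service account token.
type credentialSource struct {
	File string `json:"file"`
}

// externalAccount mirrors a file-sourced external_account configuration.
type externalAccount struct {
	Type                           string           `json:"type"`
	Audience                       string           `json:"audience"`
	SubjectTokenType               string           `json:"subject_token_type"`
	TokenURL                       string           `json:"token_url"`
	ServiceAccountImpersonationURL string           `json:"service_account_impersonation_url"`
	CredentialSource               credentialSource `json:"credential_source"`
}

// buildExternalAccount is a hypothetical helper, not the PR's function.
func buildExternalAccount(projectNumber, poolID, providerID, saEmail, tokenPath string) (externalAccount, error) {
	for _, v := range []string{projectNumber, poolID, providerID, saEmail, tokenPath} {
		if v == "" {
			return externalAccount{}, fmt.Errorf("all external_account inputs must be non-empty")
		}
	}
	return externalAccount{
		Type:             "external_account",
		Audience:         audienceURI(projectNumber, poolID, providerID),
		SubjectTokenType: "urn:ietf:params:oauth:token-type:jwt",
		TokenURL:         "https://sts.googleapis.com/v1/token",
		ServiceAccountImpersonationURL: fmt.Sprintf(
			"https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/%s:generateAccessToken",
			saEmail),
		CredentialSource: credentialSource{File: tokenPath},
	}, nil
}

func main() {
	// All values below are placeholders; the token path is an assumption.
	cred, err := buildExternalAccount("123456789012", "example-pool", "example-provider",
		"example-sa@example-project.iam.gserviceaccount.com",
		"/var/run/secrets/openshift/serviceaccount/token")
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(cred, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The audience is derived from the project number, so any provider configuration that lists allowed audiences would need to use the same string.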

Summary by CodeRabbit

  • New Features

    • Added support for Google Cloud Workload Identity Federation (WIF) in both BYO and installer-provisioned modes.
    • Installer can provision and later clean up WIF pools, OIDC providers, discovery bucket, and bindings.
    • Cloud credentials secret can now include an external-account (WIF) JSON and uses a dedicated GCP credentials secret name when enabled.
  • Behavioral

    • Install validation and defaults now enforce manual credential mode when WIF is configured.
  • Tests

    • Added validation and unit tests for WIF configuration and utilities.

Add support for short-lived token authentication via GCP Workload
Identity Federation (WIF), replacing static service account keys.

Two modes are supported:

- BYO: customer provides existing poolID and providerID in the
  install-config. The installer describes the provider via the GCP
  IAM API to extract the OIDC issuer URL, validates it, and generates
  external_account credential manifests.

- Installer-provisioned: customer sets workloadIdentityFederation: {}
  with empty fields. The installer creates a WIF pool, GCS bucket for
  OIDC discovery, OIDC provider, and service account bindings using
  deterministic names derived from infraID.

CredentialsMode auto-defaults to Manual when WIF is configured.
Destroy flow cleans up installer-provisioned WIF resources but leaves
BYO resources untouched.
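
The two modes and the credentials-mode default described above could be modelled roughly as follows. This is a sketch under stated assumptions: the struct and method names match those reported in the review walkthrough (`WorkloadIdentityFederation`, `IsWIFEnabled`, `IsWIFBYO`), but the receiver type, the exact field layout, and the `defaultCredentialsMode` helper are hypothetical.

```go
package main

import "fmt"

// WorkloadIdentityFederation holds the optional BYO identifiers.
// Both fields empty means installer-provisioned mode.
type WorkloadIdentityFederation struct {
	PoolID     string
	ProviderID string
}

// Platform is a trimmed-down stand-in for the GCP platform config.
type Platform struct {
	ProjectID                  string
	WorkloadIdentityFederation *WorkloadIdentityFederation
}

// IsWIFEnabled reports whether the WIF stanza is present at all.
func (p *Platform) IsWIFEnabled() bool {
	return p != nil && p.WorkloadIdentityFederation != nil
}

// IsWIFBYO reports whether the user supplied an existing pool and provider.
func (p *Platform) IsWIFBYO() bool {
	return p.IsWIFEnabled() &&
		p.WorkloadIdentityFederation.PoolID != "" &&
		p.WorkloadIdentityFederation.ProviderID != ""
}

// defaultCredentialsMode (hypothetical) fills in "Manual" only when the
// mode is unset and WIF is configured; explicit values are left for
// validation to accept or reject.
func defaultCredentialsMode(p *Platform, mode string) string {
	if mode == "" && p.IsWIFEnabled() {
		return "Manual"
	}
	return mode
}

func main() {
	provisioned := &Platform{WorkloadIdentityFederation: &WorkloadIdentityFederation{}}
	byo := &Platform{WorkloadIdentityFederation: &WorkloadIdentityFederation{
		PoolID: "example-pool", ProviderID: "example-provider",
	}}
	fmt.Println(provisioned.IsWIFEnabled(), provisioned.IsWIFBYO())
	fmt.Println(byo.IsWIFEnabled(), byo.IsWIFBYO())
	fmt.Println(defaultCredentialsMode(provisioned, ""))
}
```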

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label Jun 8, 2026

openshift-ci-robot commented Jun 8, 2026

Contributor

@rochacbruno: This pull request references CORS-4470 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.



coderabbitai Bot commented Jun 8, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 44a3f081-5a1e-414d-9dd1-a57fe5244a35

📥 Commits

Reviewing files that changed from the base of the PR and between 5ef9e1a and ec29fb7.

📒 Files selected for processing (3)
  • pkg/asset/manifests/openshift.go
  • pkg/infrastructure/gcp/clusterapi/wif.go
  • pkg/infrastructure/gcp/clusterapi/wif_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/infrastructure/gcp/clusterapi/wif_test.go
  • pkg/asset/manifests/openshift.go
  • pkg/infrastructure/gcp/clusterapi/wif.go

Walkthrough

This PR adds complete Workload Identity Federation (WIF) support to OpenShift's GCP installer. It defines new configuration types, validates WIF provider setup, generates external-account credentials during installation, provisions WIF infrastructure (pool, OIDC issuer, service account bindings), and cleans up resources on cluster destruction.

Changes

GCP Workload Identity Federation

| Layer / File(s) | Summary |
| --- | --- |
| Type System & Configuration Validation: pkg/types/gcp/platform.go, pkg/types/gcp/metadata.go, pkg/types/gcp/validation/*, pkg/types/defaults/installconfig.go | WorkloadIdentityFederation struct with PoolID and ProviderID fields; Platform and Metadata gain WIF pointer fields; IsWIFEnabled() and IsWIFBYO() helper methods added; validation enforces paired field presence, resource ID format (4–32 chars, starts with letter), and manual credentials mode; defaults set manual mode when WIF enabled. |
| Install-Config Client & Provider Validation: pkg/asset/installconfig/gcp/client.go, pkg/asset/installconfig/gcp/mock/gcpclient_generated.go, pkg/asset/installconfig/gcp/validation.go, pkg/asset/cluster/gcp/gcp.go | GetWIFProvider() added to client API to retrieve IAM Workload Identity Pool Provider details; mock wired via GoMock; validation checks BYO provider existence, ACTIVE state, and non-empty OIDC issuer URI; cluster metadata now carries WIF config; credential mode validation rejects non-manual modes when WIF enabled. |
| Cloud Credentials Manifest Generation: pkg/asset/manifests/template.go, pkg/asset/manifests/openshift.go, data/data/manifests/openshift/cloud-creds-secret.yaml.template | GCPWIFCredsSecretData struct added for external-account credentials; manifest generation conditionally produces external-account JSON from GCP project number when WIF enabled, bypassing manual-credentials-mode suppression for WIF; generateGCPExternalAccountJSON() helper resolves project number and constructs credential payload; template gains .CloudCreds.GCPWIF branch and Secret name handling. |
| Infrastructure Provisioning Dependencies: pkg/infrastructure/clusterapi/types.go, pkg/infrastructure/clusterapi/clusterapi.go, pkg/infrastructure/gcp/clusterapi/clusterapi.go | BoundSASigningKey field added to PreProvisionInput to carry RSA public key for OIDC; asset initialized and wired through dependency pipeline; ProvisionWIF() called when WIF enabled and not BYO. |
| WIF Infrastructure Provisioning: pkg/infrastructure/gcp/clusterapi/wif.go, pkg/infrastructure/gcp/clusterapi/wif_test.go | Creates workload identity pool, OIDC discovery bucket (public read, .well-known/openid-configuration), OIDC provider (issuer + allowed audience), and binds fixed OpenShift service accounts via roles/iam.workloadIdentityUser. Helpers: GetProjectNumber() resolves GCP project ID to number; GetBYOIssuerURL() fetches existing BYO provider issuer; GenerateJWKS() produces JSON Web Key Set from RSA public key (kid as base64url SHA-256); waitForIAMOperation() polls with exponential backoff. Tests verify naming, audience URI, OIDC discovery doc, and JWKS generation. |
| Cluster Destruction & WIF Cleanup: pkg/destroy/gcp/gcp.go, pkg/destroy/gcp/wif.go | ClusterUninstaller tracks WIF state; destruction stages added for WIF providers and pools. Lists, deletes, and tracks pending items; skips already-deleted resources; returns error if deletions remain pending after timeout. |
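
The paired-field and resource ID rules summarized in the first row could be checked along these lines. This is only a sketch: `validateWIFIDs` is a hypothetical name, and the allowed character set (lowercase letters, digits, hyphens) is an assumption, since the walkthrough only states the 4–32 character length and the leading letter.

```go
package main

import (
	"fmt"
	"regexp"
)

// wifIDPattern accepts 4 to 32 characters starting with a letter.
// The character class after the first letter is an assumption.
var wifIDPattern = regexp.MustCompile(`^[a-z][a-z0-9-]{3,31}$`)

// validateWIFIDs returns one error per violated rule.
// Both IDs empty is valid and means installer-provisioned mode.
func validateWIFIDs(poolID, providerID string) []error {
	if poolID == "" && providerID == "" {
		return nil
	}
	if poolID == "" || providerID == "" {
		return []error{fmt.Errorf("poolID and providerID must be set together")}
	}
	var errs []error
	if !wifIDPattern.MatchString(poolID) {
		errs = append(errs, fmt.Errorf("invalid poolID %q: must be 4-32 characters and start with a letter", poolID))
	}
	if !wifIDPattern.MatchString(providerID) {
		errs = append(errs, fmt.Errorf("invalid providerID %q: must be 4-32 characters and start with a letter", providerID))
	}
	return errs
}

func main() {
	for _, tc := range [][2]string{
		{"", ""},
		{"example-pool", ""},
		{"example-pool", "example-provider"},
		{"abc", "1provider"},
	} {
		fmt.Printf("pool=%q provider=%q errors=%v\n", tc[0], tc[1], validateWIFIDs(tc[0], tc[1]))
	}
}
```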

🎯 4 (Complex) | ⏱️ ~75 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 52.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (14 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: adding GCP Workload Identity Federation support, which is the primary feature across multiple files in this changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | Test files use static, descriptive names. No Ginkgo framework detected. No dynamic values, UUIDs, timestamps, pod/node names, or generated identifiers in test titles. |
| Test Structure And Quality | ✅ Passed | PR contains no Ginkgo tests; only standard Go tests with testify. The custom check for Ginkgo test quality is not applicable to this PR. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. All new tests are standard Go unit tests (wif_test.go, platform_test.go), so the check does not apply. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. Only unit tests using testify/assert in wif_test.go and platform_test.go. Check not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds GCP WIF support via configuration and helpers only. No deployment manifests, operators, or scheduling constraints are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | No process-level stdout writes found in 18 modified files. All fmt usage is fmt.Sprintf/Errorf, not stdout; no main(), init(), TestMain(), or Ginkgo suite setup patterns detected. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds only standard Go unit tests (TestXXX with testing.T), not Ginkgo e2e tests (It/Describe). Check not applicable. |
| No-Weak-Crypto | ✅ Passed | PR uses only approved strong cryptography (SHA-256, RSA, base64) and standard Go crypto libraries. No weak algorithms, custom crypto, or unsafe comparisons detected. |
| Container-Privileges | ✅ Passed | No container/K8s manifests with privileged settings found. PR contains only a Kubernetes Secret template and Go configuration code for GCP WIF support. |
| No-Sensitive-Data-In-Logs | ✅ Passed | All logging statements only log non-sensitive resource names (poolIDs, providerIDs, bucketNames) and service account emails. External account JSON credentials are never logged. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions




openshift-ci Bot commented Jun 8, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign vr4manta for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from barbacbd and bfournie June 8, 2026 17:23

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (1)
pkg/asset/manifests/openshift.go (1)

390-396: GCP project-number extraction is consistent, but strengthen validation instead of changing fields.

  • In the vendored cloudresourcemanager/v3 API, Project does not expose a ProjectNumber field; Project.Name is documented as projects/{project_number}, so project.Name[9:] matches the contract.
  • Still consider parsing more defensively: require the "projects/" prefix and validate the remainder is non-empty digits before using it in the WIF audience.
```go
projectNumber := ""
if len(project.Name) > 9 {
	projectNumber = project.Name[9:]
}
if projectNumber == "" {
	return nil, fmt.Errorf("unexpected project name format: %s", project.Name)
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/asset/manifests/openshift.go` around lines 390 - 396, The current
extraction of projectNumber from project.Name (projectNumber := ""; if
len(project.Name) > 9 { projectNumber = project.Name[9:] } ...) assumes the
format but lacks strict validation; update the logic that reads project.Name to
first verify it has the "projects/" prefix, then extract the suffix and validate
that the remainder is non-empty and consists only of digits before using it to
build the WIF audience or returning it; reference the project.Name and
projectNumber variables and the WIF audience construction so you replace the
naive slice-based extraction with prefix-checking and digit-validation and
return the same error if validation fails.
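
One way the stricter parsing requested above might look is sketched below. `parseProjectNumber` is a hypothetical helper name; the only facts taken from the review are the `projects/{project_number}` format of `Project.Name` and the requirement that the suffix be non-empty digits.

```go
package main

import (
	"fmt"
	"strings"
)

// parseProjectNumber extracts the numeric suffix from a resource name of
// the form "projects/{project_number}" and rejects anything else.
func parseProjectNumber(name string) (string, error) {
	const prefix = "projects/"
	if !strings.HasPrefix(name, prefix) {
		return "", fmt.Errorf("unexpected project name format: %s", name)
	}
	number := strings.TrimPrefix(name, prefix)
	if number == "" {
		return "", fmt.Errorf("unexpected project name format: %s", name)
	}
	// Accept ASCII digits only, so a project ID cannot slip through.
	for _, r := range number {
		if r < '0' || r > '9' {
			return "", fmt.Errorf("unexpected project name format: %s", name)
		}
	}
	return number, nil
}

func main() {
	for _, name := range []string{"projects/123456789012", "projects/", "projects/example-id", "folders/42"} {
		number, err := parseProjectNumber(name)
		fmt.Printf("%q -> %q, err=%v\n", name, number, err)
	}
}
```

Unlike the slice-based version, this rejects names such as `folders/123456789`, which are longer than nine characters but do not carry the expected prefix.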
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/asset/installconfig/gcp/client.go`:
- Around line 805-808: The IAM service used for WIF lookup is created via
GetIAMService without applying the endpoint override, causing mismatch with
GetServiceAccount which uses c.endpointName; update the call site in the
function that calls GetIAMService (where IAM service is stored in iamSvc) to
construct the IAM client with the same endpoint override as GetServiceAccount by
passing c.endpointName (or the equivalent endpoint option) into GetIAMService
(or use a variant that accepts client options), so the IAM service honors
c.endpointName for PSC/custom-endpoint environments.

In `@pkg/asset/installconfig/gcp/validation.go`:
- Around line 753-757: The current GetWIFProvider error handling treats all
failures as field.Invalid; change it to return field.NotFound when the lookup
indicates the provider is missing and field.InternalError for other
API/transport failures: call client.GetWIFProvider (using ic.GCP.ProjectID,
wif.PoolID, wif.ProviderID) and inspect the returned error (e.g., using the API
client's NotFound check or error type), then append either
field.NotFound(fldPath, wif.ProviderID, "...") for missing provider or
field.InternalError(fldPath, err.Error()) (or equivalent) for other errors
instead of always using field.Invalid.

In `@pkg/destroy/gcp/wif.go`:
- Around line 11-146: Add unit tests covering the new WIF destroy paths: write
tests for listWIFProviders, deleteWIFProvider, destroyWIFProviders and
listWIFPools, deleteWIFPool, destroyWIFPools that validate gating behavior (when
o.wifEnabled/o.wifBYO), correct list filtering (skip DELETED and match expected
pool prefix), operation/error handling (simulate iamSvc errors and isNoOp
cases), and pending-item lifecycle
(insertPendingItems/getPendingItems/deletePendingItems interaction and
suppression via errorTracker). Use a fake/mocked iamSvc responses and context
timeouts to assert logging/return values and that delete flows call the right
methods and mark items pending/cleared; include tests for error paths returning
wrapped errors and for no-op errors being treated as success.
- Around line 47-56: The code currently removes pending items and logs deletion
immediately after calling
o.iamSvc.Projects.Locations.WorkloadIdentityPools.Providers.Delete (and
similarly in the other block at lines ~116-125) even when the returned op is not
Done; change the flow to wait for the long‑running operation to complete (poll
or use the operation Get/Wait until op.Done is true or returns a non-retryable
error) before calling o.deletePendingItems(item.typeName, []cloudResource{item})
and o.Logger.Infof; preserve existing error handling with isNoOp and ensure you
handle op == nil cases, updating both the provider Delete call and the
corresponding delete path at the other block so resources are only marked
deleted after op.Done is observed.

In `@pkg/infrastructure/gcp/clusterapi/wif.go`:
- Around line 103-105: The OIDC provider is being configured with
AllowedAudiences derived from projectID while BuildAudienceURI and the WIF
credential flow use projectNumber, causing an audience mismatch; update the code
paths that construct AllowedAudiences (the code that calls createOIDCProvider
and the provider configuration in the functions that set AllowedAudiences) to
use projectNumber (or the same BuildAudienceURI result) instead of projectID so
the provider trusts the exact audience URI produced by BuildAudienceURI;
specifically, change uses of projectID when building AllowedAudiences to use
projectNumber (or call BuildAudienceURI and use that string) in the
functions/createOIDCProvider call sites referenced by providerName, poolName,
issuerURL, and infraID so the provider and WIF tokens share the same audience.
- Around line 179-180: The Workload Identity Federation create calls (e.g.,
iamSvc.Projects.Locations.WorkloadIdentityPools.Create and
WorkloadIdentityPoolProviders.Create where op, err := ...Do()) must be made
idempotent: on create failure detect "AlreadyExists"/HTTP 409 (googleapi.Error
with Code 409) and instead call the corresponding Get (or Do/Get method) to
reconcile the existing resource and continue rather than bubbling the error;
apply the same pattern to the subsequent create calls and any binding/role setup
around lines referenced (192-196, 241-242) so transient failures don’t leave the
installer stuck—i.e. try Get first or catch AlreadyExists, load the existing
resource, validate/merge needed fields, and proceed with the remaining steps.
- Around line 209-217: The discovery document is uploaded but the advertised
JWKS is not, so call GenerateJWKS and upload its output to the bucket at the
path referenced by generateOIDCDiscoveryDoc's jwks_uri (the code expects
"<issuer>/keys.json" not under .well-known); in createOIDCBucket (and the other
similar upload blocks around the other OIDC-bucket creation sites) add logic to
call GenerateJWKS(ctx), create a writer for "keys.json" (set ContentType
"application/json"), write the JWKS bytes, close the writer, and return wrapped
errors on write/close failures so Google can fetch the keyset for token
verification.
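
To illustrate the JWKS that the last comment asks to be uploaded as keys.json, here is a rough sketch of building a key set from an RSA public key using only the Go standard library. It is not the PR's `GenerateJWKS`: the function and type names are hypothetical, and hashing the PKIX DER encoding of the key to derive `kid` is an assumption (the review only says the kid is a base64url SHA-256).

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"math/big"
)

// jwk holds the public parameters of one RSA signing key.
type jwk struct {
	Kty string `json:"kty"`
	Alg string `json:"alg"`
	Use string `json:"use"`
	Kid string `json:"kid"`
	N   string `json:"n"`
	E   string `json:"e"`
}

// keySet is the top-level JWKS document.
type keySet struct {
	Keys []jwk `json:"keys"`
}

// buildJWKS converts an RSA public key into a single-entry key set.
func buildJWKS(pub *rsa.PublicKey) (keySet, error) {
	if pub == nil {
		return keySet{}, fmt.Errorf("public key is nil")
	}
	der, err := x509.MarshalPKIXPublicKey(pub)
	if err != nil {
		return keySet{}, fmt.Errorf("marshalling public key: %w", err)
	}
	sum := sha256.Sum256(der) // assumption: kid is derived from the DER bytes
	enc := base64.RawURLEncoding
	return keySet{Keys: []jwk{{
		Kty: "RSA",
		Alg: "RS256",
		Use: "sig",
		Kid: enc.EncodeToString(sum[:]),
		N:   enc.EncodeToString(pub.N.Bytes()),
		E:   enc.EncodeToString(big.NewInt(int64(pub.E)).Bytes()),
	}}}, nil
}

// newDemoKey generates a throwaway key so the example can run on its own.
func newDemoKey() (*rsa.PublicKey, error) {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, err
	}
	return &priv.PublicKey, nil
}

func main() {
	pub, err := newDemoKey()
	if err != nil {
		panic(err)
	}
	set, err := buildJWKS(pub)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(set, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The resulting bytes would then be written to the object that the discovery document's jwks_uri points at, with content type application/json.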


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0289485a-9446-4b05-991c-74503fbfa1f5

📥 Commits

Reviewing files that changed from the base of the PR and between 4a5404e and b1e8b83.

⛔ Files ignored due to path filters (2)
  • data/data/install.openshift.io_installconfigs.yaml is excluded by !data/data/install.openshift.io_installconfigs.yaml
  • pkg/types/gcp/zz_generated.deepcopy.go is excluded by !**/zz_generated*
📒 Files selected for processing (18)
  • data/data/manifests/openshift/cloud-creds-secret.yaml.template
  • pkg/asset/cluster/gcp/gcp.go
  • pkg/asset/installconfig/azure/mock/azureclient_generated.go
  • pkg/asset/installconfig/gcp/client.go
  • pkg/asset/installconfig/gcp/mock/gcpclient_generated.go
  • pkg/asset/installconfig/gcp/validation.go
  • pkg/asset/manifests/openshift.go
  • pkg/asset/manifests/template.go
  • pkg/destroy/gcp/gcp.go
  • pkg/destroy/gcp/wif.go
  • pkg/infrastructure/gcp/clusterapi/clusterapi.go
  • pkg/infrastructure/gcp/clusterapi/wif.go
  • pkg/infrastructure/gcp/clusterapi/wif_test.go
  • pkg/types/defaults/installconfig.go
  • pkg/types/gcp/metadata.go
  • pkg/types/gcp/platform.go
  • pkg/types/gcp/validation/platform.go
  • pkg/types/gcp/validation/platform_test.go

Comment thread pkg/asset/installconfig/gcp/client.go Outdated
Comment thread pkg/asset/installconfig/gcp/validation.go
Comment thread pkg/destroy/gcp/wif.go
Comment thread pkg/destroy/gcp/wif.go
Comment thread pkg/infrastructure/gcp/clusterapi/wif.go
Comment thread pkg/infrastructure/gcp/clusterapi/wif.go
Comment thread pkg/infrastructure/gcp/clusterapi/wif.go Outdated
rochacbruno and others added 4 commits June 8, 2026 18:40
Use fmt.Errorf with %w in destroy/gcp/wif.go instead of deprecated
errors.Wrapf from github.com/pkg/errors. Remove the unrelated Azure
mock cosmetic diff that was pulled in by hack/go-genmock.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Upload JWKS (keys.json) to the OIDC GCS bucket so GCP can validate
projected service account tokens. The bound SA signing key must be
provided by the user via bound-service-account-signing-key.key in
the asset directory.

Thread the BoundSASigningKey asset through PreProvisionInput so the
GCP WIF provisioning flow can extract the public key and generate
the JSON Web Key Set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix audience mismatch: use projectNumber instead of projectID when
  building the OIDC provider AllowedAudiences, matching the audience
  URI in the generated external_account credentials
- Add PSC endpoint override to GetWIFProvider for custom-endpoint
  environments
- Wait for WIF delete LROs before marking resources as deleted in the
  destroy flow, preventing premature cleanup
- Make WIF pool and provider creation idempotent by handling 409
  AlreadyExists errors
- Use field.NotFound for missing WIF providers and field.InternalError
  for API failures instead of generic field.Invalid
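
The 409 handling and the NotFound versus internal-error split in the list above might be structured like the following sketch. It uses a stand-in `apiError` type so that it runs with the standard library alone; real code would more likely inspect the `Code` field of `*googleapi.Error`. The helper names `ensureCreated` and `classifyLookupError` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// apiError is a stand-in for an HTTP API error that carries a status code.
type apiError struct {
	Code    int
	Message string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("api error %d: %s", e.Code, e.Message)
}

// hasStatus reports whether err wraps an apiError with the given code.
func hasStatus(err error, code int) bool {
	var apiErr *apiError
	return errors.As(err, &apiErr) && apiErr.Code == code
}

// ensureCreated calls create and, if the resource already exists (409),
// falls back to get so a rerun can continue. It reports whether a new
// resource was actually created.
func ensureCreated(create func() error, get func() error) (bool, error) {
	err := create()
	if err == nil {
		return true, nil
	}
	if !hasStatus(err, http.StatusConflict) {
		return false, fmt.Errorf("create failed: %w", err)
	}
	if err := get(); err != nil {
		return false, fmt.Errorf("resource exists but could not be read: %w", err)
	}
	return false, nil
}

// classifyLookupError separates a missing resource from other failures.
func classifyLookupError(err error) string {
	switch {
	case err == nil:
		return "OK"
	case hasStatus(err, http.StatusNotFound):
		return "NotFound"
	default:
		return "InternalError"
	}
}

func main() {
	conflict := func() error { return &apiError{Code: http.StatusConflict, Message: "already exists"} }
	found := func() error { return nil }
	created, err := ensureCreated(conflict, found)
	fmt.Println(created, err)
	fmt.Println(classifyLookupError(&apiError{Code: http.StatusNotFound, Message: "missing"}))
	fmt.Println(classifyLookupError(errors.New("connection reset")))
}
```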

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Handle error return from generateOIDCDiscoveryDoc (errcheck)
- Annotate external_account type string as not a hardcoded credential
  (gosec G101)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci Bot commented Jun 9, 2026

Contributor

@rochacbruno: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-ovn-byo-vpc | ec29fb7 | link | false | /test e2e-gcp-ovn-byo-vpc |
| ci/prow/e2e-gcp-xpn-custom-dns | ec29fb7 | link | false | /test e2e-gcp-xpn-custom-dns |
| ci/prow/e2e-gcp-custom-endpoints | ec29fb7 | link | false | /test e2e-gcp-custom-endpoints |
| ci/prow/e2e-gcp-secureboot | ec29fb7 | link | false | /test e2e-gcp-secureboot |
| ci/prow/e2e-aws-ovn | ec29fb7 | link | true | /test e2e-aws-ovn |
| ci/prow/e2e-gcp-custom-dns | ec29fb7 | link | false | /test e2e-gcp-custom-dns |
| ci/prow/e2e-gcp-ovn-xpn | ec29fb7 | link | false | /test e2e-gcp-ovn-xpn |
| ci/prow/e2e-gcp-default-config | ec29fb7 | link | false | /test e2e-gcp-default-config |
| ci/prow/gcp-custom-endpoints-proxy-wif | ec29fb7 | link | false | /test gcp-custom-endpoints-proxy-wif |
| ci/prow/e2e-gcp-xpn-dedicated-dns-project | ec29fb7 | link | false | /test e2e-gcp-xpn-dedicated-dns-project |
| ci/prow/e2e-gcp-ovn | ec29fb7 | link | true | /test e2e-gcp-ovn |
| ci/prow/gcp-private | ec29fb7 | link | false | /test gcp-private |


```go
workerIgnAsset := &machine.Worker{}
tfvarsAsset := &tfvars.TerraformVariables{}
rootCA := &tls.RootCA{}
boundSASigningKey := &tls.BoundSASigningKey{}
```

@rochacbruno rochacbruno Jun 9, 2026

Member Author


There are three alternatives to this:

  1. Generate the key pair inside the installer

Instead of requiring the user to provide bound-service-account-signing-key.key, make the BoundSASigningKey asset's Generate() method actually produce an RSA key pair when WIF is enabled (currently it's a no-op). The private key would flow into bootstrap ignition so kube-apiserver uses it for SA token signing, and the public key would be uploaded as JWKS to the OIDC bucket.

The challenge: BoundSASigningKey.Generate() currently has no access to the install config (its Dependencies() returns nil). We'd need to add InstallConfig as a dependency and conditionally generate only when WIF is active. This changes the asset DAG, but it's the cleanest long-term approach since the user doesn't need to know about signing keys at all.

  2. Defer JWKS upload to post-bootstrap

Create the WIF pool, bucket, and OIDC provider during PreProvision but leave the JWKS empty. After bootstrap, when the cluster has generated its own SA signing key, upload the JWKS from a PostProvision hook, or let the cloud-credential-operator handle it.

The trade-off: WIF token validation won't work until after bootstrap completes and the JWKS is uploaded. That's probably fine since WIF credentials aren't needed during bootstrap itself - they're consumed by operators that start after the control plane is up.

  3. Use inline JWKS on the OIDC provider instead of the bucket

The WorkloadIdentityPoolProvider has a JwksJson field. When set, GCP validates tokens against that inline JWKS instead of fetching from jwks_uri. This eliminates the need for the bucket to serve keys.json at all, though we'd still need the public key from somewhere.

This could combine with option 2 - create the provider without JwksJson initially, then patch it post-bootstrap once the signing key exists.
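
As a rough illustration of the first alternative, the conditional key generation could look something like the sketch below. Everything here is hypothetical: the function names, the 2048-bit key size, and the PKCS#1 PEM encoding are assumptions, and the sketch does not model the asset DAG change that adding InstallConfig as a dependency would involve.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// generateSigningKey returns a PEM-encoded RSA private key when WIF is
// enabled and nil otherwise, mirroring a no-op for non-WIF installs.
func generateSigningKey(wifEnabled bool, bits int) ([]byte, error) {
	if !wifEnabled {
		return nil, nil
	}
	key, err := rsa.GenerateKey(rand.Reader, bits)
	if err != nil {
		return nil, fmt.Errorf("generating RSA key: %w", err)
	}
	return pem.EncodeToMemory(&pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(key),
	}), nil
}

// publicKeyFromPEM recovers the public half, which is the part that a
// JWKS (uploaded or inline) would be built from.
func publicKeyFromPEM(data []byte) (*rsa.PublicKey, error) {
	block, _ := pem.Decode(data)
	if block == nil || block.Type != "RSA PRIVATE KEY" {
		return nil, fmt.Errorf("no RSA private key PEM block found")
	}
	key, err := x509.ParsePKCS1PrivateKey(block.Bytes)
	if err != nil {
		return nil, fmt.Errorf("parsing private key: %w", err)
	}
	return &key.PublicKey, nil
}

func main() {
	keyPEM, err := generateSigningKey(true, 2048)
	if err != nil {
		panic(err)
	}
	pub, err := publicKeyFromPEM(keyPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("generated key with modulus bits:", pub.N.BitLen())
}
```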

@rochacbruno rochacbruno marked this pull request as draft June 9, 2026 17:26
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label Jun 9, 2026

Labels

do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.


2 participants