diff --git a/.claude/commands/dashboard-dev.md b/.claude/commands/dashboard-dev.md new file mode 100644 index 0000000000..4d3dfb49e1 --- /dev/null +++ b/.claude/commands/dashboard-dev.md @@ -0,0 +1,18 @@ +--- +description: Guide for contributing to and deploying the Delivery Dashboard +--- + +# Dashboard Dev Command + +Load the dashboard-dev skill and assist with the user's request. + +## Execution + +Load and follow the dashboard-dev skill from `.claude/skills/dashboard-dev/SKILL.md`. + +Use it to help with: +- Forking and setting up the repo for development +- Running the dashboard locally +- Deploying to an OpenShift cluster +- Adding new pages, queries, or features +- Debugging a running deployment on cluster \ No newline at end of file diff --git a/.claude/skills/dashboard-dev/SKILL.md b/.claude/skills/dashboard-dev/SKILL.md new file mode 100644 index 0000000000..01ad304274 --- /dev/null +++ b/.claude/skills/dashboard-dev/SKILL.md @@ -0,0 +1,128 @@ +--- +name: dashboard-dev +description: Guide for contributing to and deploying the Delivery Dashboard +allowed-tools: [Bash, Read, Grep, Glob, Write, Edit, TodoWrite] +--- + +# Delivery Dashboard Development Skill + +## Purpose + +Help developers contribute to, run locally, and deploy the Delivery Dashboard — a web UI showing operator pipeline status across stage and integration environments, backed by SQLite, SQS, and S3. + +--- + +## Codebase Layout + +``` +pkg/dashboard/ + models/types.go # data models (PipelineRun, FailureGroup, etc.) 
+ store/store.go # SQLite queries + server/server.go # HTTP handlers and routes + server/templates/ # Go HTML templates + base.html # nav, layout + operators.html # deliverables/pipelines page + pipeline-detail.html # per-operator history + analysis.html # failure grouping by AI root cause + usage.html # infra/clusters page +cmd/osde2e/dashboard/ # CLI entry point (flags, wiring) +scripts/dashboard/ + deploy.sh # local dev deploy to OpenShift cluster + verify-build.sh # sanity check binary + templates +configs/local/ + dashboard-build/ # podman build context (Dockerfile committed, binary gitignored) +``` + +Manifests live in the adjacent **hp-delivery-apps** repo: +``` +delivery-dashboard/ + base/ # Deployment + Service + overlays/ + local/ # personal dev cluster (gitignored, manually provisioned secrets) + stage/ # vault ExternalSecrets + prod/ # vault ExternalSecrets +``` + +--- + +## Local Development (native, no container) + +```bash +make dashboard +``` + +Builds the binary and runs it at http://localhost:8080/dashboard/deliverables against `./dashboard.db`. + + +## Deploying to Your Own OpenShift Cluster + +### Prerequisites + +- `podman login quay.io` +- `oc login ` +- hp-delivery-apps repo cloned adjacent to this repo +- Secrets pre-created in the target namespace (see hp-delivery-apps/delivery-dashboard/README.md) + +### Create secrets (local overlay — vault handles stage/prod automatically) + +```bash +oc create secret generic osde2e-ocm-credentials \ + --from-literal=ocm-client-id= \ + --from-literal=ocm-client-secret= \ + -n + +oc create secret generic osde2e-aws-credentials \ + --from-literal=aws-access-key-id= \ + --from-literal=aws-secret-access-key= \ + -n +``` + +### Set SQS_QUEUE_URL + +Edit `hp-delivery-apps/delivery-dashboard/overlays/local/configmap.yaml` directly — it is gitignored. + +### Deploy + +```bash +DASHBOARD_QUAY_IMAGE=quay.io//delivery-dashboard:latest \ + QUAY_EXPIRE=26w \ + ./scripts/dashboard/deploy.sh +``` + +The script: +1. 
Checks required secrets exist (fails fast if not) +2. Compiles linux/amd64 binary → `configs/local/dashboard-build/osde2e` +3. Builds slim image via podman and pushes to quay +4. Applies `kustomize build overlays/local | oc apply -f -` +5. Waits for rollout, prints URL + +Route URL: `https://live-.apps./dashboard/deliverables` + +### When to rebuild vs re-apply + +| Change type | Action | +|-------------|--------| +| Go source / templates | Re-run `deploy.sh` | +| ConfigMap / env vars | Edit overlay configmap, `kustomize build \| oc apply -f -` | +| Route / Service | Same as above, no restart needed | + +--- + +## Common Development Tasks + +- **Add a new page**: template in `server/templates/`, handler in `server.go`, route in `setupRoutes()`, nav link in `base.html` +- **Add a data query**: method in `store/store.go`, model in `models/types.go` +- **Check logs**: `oc logs -f deployment/delivery-dashboard -n ` +- **Check pod status**: `oc get pods -n ` +- **Rolling restart**: `oc rollout restart deployment/delivery-dashboard -n ` + +--- + +## Architecture + +- **Pipeline data**: SQS listener polls for S3 event notifications; each event points to a test result JSON, downloaded and parsed into `pipeline_runs` SQLite table +- **Pipeline Backfill**: on startup with `--backfill`, scans S3 bucket directly for historical results +- **Pipeline LLM analysis**: stored in `llm_analysis` column as JSON; parsed to extract `root_cause` and `recommendations` +- **OCM data**: collectors query OCM API for cluster reserves, usage metrics, and environment status (stage/int/prod) +- **Local Storage**: single SQLite file at `/data/dashboard.db`, mounted via `emptyDir` (repopulated from S3 + OCM on each start) +- **UI Templates**: standard Go `html/template`, server-side rendered, no JS framework \ No newline at end of file diff --git a/AGENTS.md b/AGENTS.md index cf53f719cf..897c5243d4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -57,13 +57,16 @@ Osde2e is End-to-end testing framework for 
Managed services for OSD/ROSA. 3. Integration test failures? Check credentials/env vars 4. Always use `gofumpt`, not `gofmt` 5. Check git status before committing +6. Dashboard work? Use the `/dashboard-dev` skill — it has deploy steps, architecture, and local dev instructions ## Architecture ``` osde2e -├── cmd/osde2e/ # CLI commands (provision, test, cleanup, krknai) +├── cmd/osde2e/ # CLI commands (provision, test, cleanup, krknai, dashboard) ├── pkg/common/ # Core logic (config, providers, helpers) +├── pkg/dashboard/ # Delivery Dashboard (server, store, collectors, models) ├── internal/ # LLM analysis (llm, sanitizer, prompts) +├── .claude/skills/ # Claude Code skills (use /dashboard-dev for dashboard work) └── test/ # Standalone Ginkgo test suites ``` diff --git a/Makefile b/Makefile index 7a690fa26a..c6ddf8434e 100644 --- a/Makefile +++ b/Makefile @@ -1,4 +1,4 @@ -.PHONY: check generate test +.PHONY: check generate test dashboard PKG := github.com/openshift/osde2e DOC_PKG := $(PKG)/cmd/osde2e-docs @@ -37,6 +37,9 @@ build: mkdir -p "$(OUT_DIR)" go build -o "$(OUT_DIR)" "$(DIR)cmd/..." 
+dashboard: build + "$(OUT_DIR)/osde2e" dashboard --db="$(DIR)dashboard.db" --backfill --port=8080 + diffproviders.txt: "$(DIR)scripts/generate-providers-import.sh" > diffproviders.txt diff --git a/cmd/osde2e/dashboard/cmd.go b/cmd/osde2e/dashboard/cmd.go new file mode 100644 index 0000000000..69ed05bf70 --- /dev/null +++ b/cmd/osde2e/dashboard/cmd.go @@ -0,0 +1,190 @@ +package dashboard + +import ( + "context" + "fmt" + "log" + "os" + "os/signal" + "syscall" + + "github.com/openshift/osde2e/cmd/osde2e/common" + "github.com/openshift/osde2e/cmd/osde2e/helpers" + viper "github.com/openshift/osde2e/pkg/common/concurrentviper" + "github.com/openshift/osde2e/pkg/common/providers/ocmprovider" + "github.com/openshift/osde2e/pkg/dashboard/collectors" + "github.com/openshift/osde2e/pkg/dashboard/config" + "github.com/openshift/osde2e/pkg/dashboard/server" + "github.com/openshift/osde2e/pkg/dashboard/store" + "github.com/spf13/cobra" +) + +var Cmd = &cobra.Command{ + Use: "dashboard", + Short: "Start osde2e dashboard web server", + Long: "Start a web dashboard that aggregates cluster reserves, usage metrics, and test results from OCM and S3.", + Args: cobra.NoArgs, + Run: run, +} + +var args struct { + configString string + secretLocations string + environment string + port int + maxResults int + sqsQueueURL string + dbPath string + backfill bool +} + +func init() { + pfs := Cmd.PersistentFlags() + + pfs.StringVar(&args.configString, "configs", "", "A comma separated list of built in configs to use") + _ = Cmd.RegisterFlagCompletionFunc("configs", helpers.ConfigComplete) + + pfs.StringVar(&args.secretLocations, "secret-locations", "", + "A comma separated list of possible secret directory locations for loading secret configs.") + + pfs.StringVarP(&args.environment, "environment", "e", "", + "Filter clusters by environment (stage, prod, integration, all). 
Defaults to 'all'.") + + pfs.IntVarP(&args.port, "port", "p", config.DefaultPort, "HTTP port for the dashboard server") + + pfs.IntVar(&args.maxResults, "max-results", config.DefaultMaxTestResults, + "Maximum number of test results to display") + + pfs.StringVar(&args.sqsQueueURL, "sqs-queue-url", "", + "SQS queue URL receiving S3 ObjectCreated notifications. When set, enables event-driven DB updates.") + + pfs.StringVar(&args.dbPath, "db", "dashboard.db", + "Path to the SQLite database file. Use ':memory:' for an ephemeral in-memory DB.") + + pfs.BoolVar(&args.backfill, "backfill", false, + "Scan all historical S3 objects and populate the DB before starting the server.") + + // Bind flags to viper + _ = viper.BindPFlag(config.Port, pfs.Lookup("port")) + _ = viper.BindPFlag(config.Environment, pfs.Lookup("environment")) + _ = viper.BindPFlag(config.MaxTestResults, pfs.Lookup("max-results")) + _ = viper.BindPFlag(ocmprovider.Env, pfs.Lookup("environment")) + _ = viper.BindPFlag(config.SQSQueueURL, pfs.Lookup("sqs-queue-url")) + _ = viper.BindPFlag(config.DBPath, pfs.Lookup("db")) +} + +func run(cmd *cobra.Command, argv []string) { + log.Println("==== Starting osde2e Dashboard ====") + + // Unset personal OCM token so the dashboard authenticates via OCM_CLIENT_ID/SECRET only. 
+ os.Unsetenv("OCM_TOKEN") + + // Load configurations + if err := common.LoadConfigs(args.configString, "", args.secretLocations); err != nil { + log.Printf("Error loading initial configuration: %v", err) + os.Exit(1) + } + + // Set dashboard defaults + config.SetDefaults() + + // Override with CLI flags if explicitly set + if cmd.PersistentFlags().Changed("port") { + viper.Set(config.Port, args.port) + } + if cmd.PersistentFlags().Changed("environment") { + viper.Set(config.Environment, args.environment) + viper.Set(ocmprovider.Env, args.environment) + } + if cmd.PersistentFlags().Changed("max-results") { + viper.Set(config.MaxTestResults, args.maxResults) + } + if cmd.PersistentFlags().Changed("sqs-queue-url") { + viper.Set(config.SQSQueueURL, args.sqsQueueURL) + } + if cmd.PersistentFlags().Changed("db") { + viper.Set(config.DBPath, args.dbPath) + } + + // Load dashboard configuration + dashboardConfig := config.LoadConfig() + + // Validate configuration + if dashboardConfig.OCMConfigPath == "" { + log.Println("Warning: OCM_CONFIG not set. OCM features may not work.") + } + if dashboardConfig.S3Bucket == "" { + log.Println("Warning: LOG_BUCKET not set. S3 test results will not be available.") + } + + log.Printf("Dashboard Configuration:") + log.Printf(" Port: %d", dashboardConfig.Port) + log.Printf(" S3 Bucket: %s", dashboardConfig.S3Bucket) + log.Printf(" S3 Region: %s", dashboardConfig.S3Region) + log.Printf(" Environment: %s", dashboardConfig.Environment) + log.Printf(" DB Path: %s", dashboardConfig.DBPath) + log.Printf(" SQS Queue URL: %s", dashboardConfig.SQSQueueURL) + + // Open the SQLite store + st, err := store.Open(dashboardConfig.DBPath) + if err != nil { + log.Printf("Failed to open store at %s: %v", dashboardConfig.DBPath, err) + os.Exit(1) + } + defer st.Close() + + // Top-level context — cancelled on Ctrl+C or SIGTERM, shuts down everything. 
+ ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM) + defer cancel() + + // Optionally backfill historical S3 data into the DB and start the SQS consumer + if args.backfill || dashboardConfig.SQSQueueURL != "" { + if dashboardConfig.S3Bucket == "" { + log.Println("Warning: LOG_BUCKET is not set; skipping backfill and SQS consumer.") + } else { + consumer, err := collectors.NewSQSConsumer( + dashboardConfig.SQSQueueURL, + dashboardConfig.S3Bucket, + dashboardConfig.S3Region, + st, + ) + if err != nil { + log.Printf("Warning: failed to create SQS consumer: %v", err) + } else { + if args.backfill { + log.Println("Truncating DB before backfill...") + if err := st.Truncate(); err != nil { + log.Printf("Warning: truncate failed: %v", err) + } + log.Println("Running backfill — this may take a few minutes...") + if err := consumer.Backfill(); err != nil { + log.Printf("Backfill error: %v", err) + } + } + + // Start the SQS consumer goroutine (only when queue URL is configured) + if dashboardConfig.SQSQueueURL != "" { + go consumer.Run(ctx) + log.Printf("SQS consumer started") + } + } + } + } + + // Create and start the HTTP server + srv, err := server.NewServer(dashboardConfig) + if err != nil { + log.Printf("Failed to create dashboard server: %v", err) + os.Exit(1) + } + srv.WithStore(st) + + addr := fmt.Sprintf(":%d", dashboardConfig.Port) + log.Printf("Dashboard server starting on http://localhost%s", addr) + log.Printf("Press Ctrl+C to stop") + + if err := srv.Start(addr, ctx); err != nil { + log.Printf("Server error: %v", err) + os.Exit(1) + } +} diff --git a/cmd/osde2e/main.go b/cmd/osde2e/main.go index e46c6fc892..a527bb34e0 100644 --- a/cmd/osde2e/main.go +++ b/cmd/osde2e/main.go @@ -16,6 +16,7 @@ import ( "github.com/openshift/osde2e/cmd/osde2e/arguments" "github.com/openshift/osde2e/cmd/osde2e/cleanup" "github.com/openshift/osde2e/cmd/osde2e/completion" + "github.com/openshift/osde2e/cmd/osde2e/dashboard" 
"github.com/openshift/osde2e/cmd/osde2e/healthcheck" "github.com/openshift/osde2e/cmd/osde2e/krknai" "github.com/openshift/osde2e/cmd/osde2e/provision" @@ -46,6 +47,7 @@ func init() { root.AddCommand(completion.Cmd) root.AddCommand(cleanup.Cmd) root.AddCommand(krknai.Cmd) + root.AddCommand(dashboard.Cmd) } func main() { diff --git a/configs/local/dashboard-build/.gitignore b/configs/local/dashboard-build/.gitignore new file mode 100644 index 0000000000..d83f64bc1b --- /dev/null +++ b/configs/local/dashboard-build/.gitignore @@ -0,0 +1 @@ +osde2e \ No newline at end of file diff --git a/configs/local/dashboard-build/Dockerfile b/configs/local/dashboard-build/Dockerfile new file mode 100644 index 0000000000..0d89d01542 --- /dev/null +++ b/configs/local/dashboard-build/Dockerfile @@ -0,0 +1,13 @@ +FROM registry.access.redhat.com/ubi9/ubi-minimal:latest +WORKDIR / +COPY osde2e /osde2e +ENV PATH="${PATH}:/" +ENTRYPOINT ["/osde2e"] + +LABEL name="delivery-dashboard" +LABEL description="Delivery Dashboard — pipeline status for Service Delivery operators, sourced from S3 and SQS" +LABEL summary="Web dashboard showing operator pipeline status across stage and integration environments" +LABEL com.redhat.component="delivery-dashboard" +LABEL io.k8s.description="delivery-dashboard" +LABEL io.k8s.display-name="Delivery Dashboard" +LABEL io.openshift.tags="dashboard,delivery,operators" \ No newline at end of file diff --git a/dashboard.Dockerfile b/dashboard.Dockerfile new file mode 100644 index 0000000000..fda72d9b4f --- /dev/null +++ b/dashboard.Dockerfile @@ -0,0 +1,28 @@ +FROM registry.access.redhat.com/ubi9/go-toolset:latest AS builder + +USER root +ENV GOFLAGS= +ENV PKG=/opt/app-root/src/github.com/openshift/osde2e/ +WORKDIR ${PKG} + +COPY go.* . +RUN go mod download +COPY . . +RUN go env +RUN make build + +FROM registry.access.redhat.com/ubi9/ubi-minimal:latest +WORKDIR / + +COPY --from=builder /opt/app-root/src/github.com/openshift/osde2e/out/osde2e . 
+ +ENV PATH="${PATH}:/" +ENTRYPOINT ["/osde2e"] + +LABEL name="delivery-dashboard" +LABEL description="Delivery Dashboard — pipeline status for Service Delivery operators, sourced from S3 and SQS" +LABEL summary="Web dashboard showing operator pipeline status across stage and integration environments" +LABEL com.redhat.component="delivery-dashboard" +LABEL io.k8s.description="delivery-dashboard" +LABEL io.k8s.display-name="Delivery Dashboard" +LABEL io.openshift.tags="dashboard,delivery,operators" diff --git a/go.mod b/go.mod index 442f166d64..8d509e2fc4 100644 --- a/go.mod +++ b/go.mod @@ -36,10 +36,10 @@ require ( github.com/spf13/pflag v1.0.9 github.com/spf13/viper v1.19.0 github.com/vmware-tanzu/velero v1.10.2 - golang.org/x/net v0.49.0 + golang.org/x/net v0.50.0 golang.org/x/oauth2 v0.34.0 // indirect - golang.org/x/sync v0.19.0 - golang.org/x/tools v0.41.0 + golang.org/x/sync v0.20.0 + golang.org/x/tools v0.42.0 google.golang.org/api v0.227.0 google.golang.org/genproto v0.0.0-20250409194420-de1ac958c67a // indirect gopkg.in/yaml.v3 v3.0.1 @@ -62,6 +62,7 @@ require ( github.com/openshift/api v0.0.0-20260318185450-1f2fa3f09f4e github.com/stretchr/testify v1.11.1 google.golang.org/genai v1.51.0 + modernc.org/sqlite v1.52.0 ) require ( @@ -97,6 +98,7 @@ require ( github.com/cespare/xxhash/v2 v2.3.0 // indirect github.com/cloudflare/circl v1.6.3 // indirect github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect + github.com/dustin/go-humanize v1.0.1 // indirect github.com/emicklei/go-restful/v3 v3.12.2 // indirect github.com/evanphx/json-patch/v5 v5.9.11 // indirect github.com/felixge/httpsnoop v1.0.4 // indirect @@ -127,11 +129,13 @@ require ( github.com/json-iterator/go v1.1.12 // indirect github.com/magiconair/properties v1.8.7 // indirect github.com/mailru/easyjson v0.7.7 // indirect + github.com/mattn/go-isatty v0.0.20 // indirect github.com/moby/spdystream v0.5.0 // indirect github.com/modern-go/concurrent 
v0.0.0-20180306012644-bacd9c7ef1dd // indirect github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect + github.com/ncruces/go-strftime v1.0.0 // indirect github.com/openshift-online/ocm-api-model/clientapi v0.0.453 // indirect github.com/openshift-online/ocm-api-model/model v0.0.453 // indirect github.com/openshift/library-go v0.0.0-20260311094140-ac826d10cb40 // indirect @@ -139,6 +143,7 @@ require ( github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect github.com/prometheus/client_model v0.6.2 // indirect github.com/prometheus/procfs v0.16.1 // indirect + github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect github.com/sagikazarmark/locafero v0.4.0 // indirect github.com/sagikazarmark/slog-shim v0.1.0 // indirect github.com/sirupsen/logrus v1.9.3 // indirect @@ -157,12 +162,12 @@ require ( go.uber.org/multierr v1.11.0 // indirect go.yaml.in/yaml/v2 v2.4.3 // indirect go.yaml.in/yaml/v3 v3.0.4 // indirect - golang.org/x/crypto v0.47.0 // indirect + golang.org/x/crypto v0.48.0 // indirect golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect - golang.org/x/mod v0.32.0 // indirect - golang.org/x/sys v0.40.0 // indirect - golang.org/x/term v0.39.0 // indirect - golang.org/x/text v0.33.0 // indirect + golang.org/x/mod v0.33.0 // indirect + golang.org/x/sys v0.42.0 // indirect + golang.org/x/term v0.40.0 // indirect + golang.org/x/text v0.34.0 // indirect golang.org/x/time v0.12.0 // indirect google.golang.org/genproto/googleapis/api v0.0.0-20251202230838-ff82c1b0f217 // indirect google.golang.org/genproto/googleapis/rpc v0.0.0-20251202230838-ff82c1b0f217 // indirect @@ -174,6 +179,9 @@ require ( k8s.io/apiextensions-apiserver v0.35.1 // indirect k8s.io/component-base v0.35.2 // indirect k8s.io/kube-openapi 
v0.0.0-20250910181357-589584f1c912 // indirect + modernc.org/libc v1.72.3 // indirect + modernc.org/mathutil v1.7.1 // indirect + modernc.org/memory v1.11.0 // indirect sigs.k8s.io/json v0.0.0-20250730193827-2d320260d730 // indirect sigs.k8s.io/randfill v1.0.0 // indirect sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect diff --git a/go.sum b/go.sum index 455b129200..92886d17b1 100644 --- a/go.sum +++ b/go.sum @@ -85,6 +85,8 @@ github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSs github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM= github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY= +github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto= github.com/emicklei/go-restful/v3 v3.12.2 h1:DhwDP0vY3k8ZzE0RunuJy8GhNpPL6zqLkDf9B/a0/xU= github.com/emicklei/go-restful/v3 v3.12.2/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc= github.com/emirpasic/gods v1.18.1 h1:FXtiHYKDGKCW2KzwZKx0iC0PQmdlorYgdFG9jPXJ1Bc= @@ -187,6 +189,9 @@ github.com/hashicorp/go-retryablehttp v0.7.8 h1:ylXZWnqa7Lhqpk0L1P1LzDtGcCR0rPVU github.com/hashicorp/go-retryablehttp v0.7.8/go.mod h1:rjiScheydd+CxvumBsIrFKlx3iS0jrZ7LvzFGFmuKbw= github.com/hashicorp/go-version v1.7.0 h1:5tqGy27NaOTB8yJKUZELlFAS/LTKJkrmONwQKeRZfjY= github.com/hashicorp/go-version v1.7.0/go.mod h1:fltr4n8CU8Ke44wwGCBoEymUuxUHl09ZGVZPK5anwXA= +github.com/hashicorp/golang-lru v0.5.4 h1:YDjusn29QI/Das2iO9M0BHnIbxPeyuCHsjMW+lJfyTc= +github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k= +github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM= github.com/hashicorp/hc-install v0.9.2 
h1:v80EtNX4fCVHqzL9Lg/2xkp62bbvQMnvPQ0G+OmtO24= github.com/hashicorp/hc-install v0.9.2/go.mod h1:XUqBQNnuT4RsxoxiM9ZaUk0NX8hi2h+Lb6/c0OZnC/I= github.com/hashicorp/hcl v1.0.0 h1:0Anlzjpi4vEasTeNFn2mLJgTSwt0+6sfsiTG8qcWGx4= @@ -257,6 +262,8 @@ github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+ github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f h1:y5//uYreIhSUg3J1GEMiLbxo1LJaP8RfCpH6pymGZus= github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw= +github.com/ncruces/go-strftime v1.0.0 h1:HMFp8mLCTPp341M/ZnA4qaf7ZlsbTc+miZjCLOFAw7w= +github.com/ncruces/go-strftime v1.0.0/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls= github.com/onsi/ginkgo/v2 v2.28.1 h1:S4hj+HbZp40fNKuLUQOYLDgZLwNUVn19N3Atb98NCyI= github.com/onsi/ginkgo/v2 v2.28.1/go.mod h1:CLtbVInNckU3/+gC8LzkGUb9oF+e8W8TdUsxPwvdOgE= github.com/onsi/gomega v1.39.1 h1:1IJLAad4zjPn2PsnhH70V4DKRFlrCzGBNrNaru+Vf28= @@ -304,6 +311,8 @@ github.com/prometheus/common v0.67.5 h1:pIgK94WWlQt1WLwAC5j2ynLaBRDiinoAb86HZHTU github.com/prometheus/common v0.67.5/go.mod h1:SjE/0MzDEEAyrdr5Gqc6G+sXI67maCxzaT3A2+HqjUw= github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg= github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is= +github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94icq4NjY3clb7Lk8O1qJ8BdBEF8z0ibU0rE= +github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo= github.com/rogpeppe/go-internal v1.14.1 h1:UQB4HGPB6osV0SQTLymcB4TgvyWu6ZyliaW0tI/otEQ= github.com/rogpeppe/go-internal v1.14.1/go.mod h1:MaRKkUm5W0goXpeCfT7UZI6fk/L7L7so1lCWt35ZSgc= github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= @@ 
-399,8 +408,8 @@ golang.org/x/crypto v0.13.0/go.mod h1:y6Z2r+Rw4iayiXXAIxJIDAJ1zMW4yaTpebo8fPOliY golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU= golang.org/x/crypto v0.23.0/go.mod h1:CKFgDieR+mRhux2Lsu27y0fO304Db0wZe70UKqHu0v8= golang.org/x/crypto v0.31.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk= -golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8= -golang.org/x/crypto v0.47.0/go.mod h1:ff3Y9VzzKbwSSEzWqJsJVBnWmRwRSHt/6Op5n9bQc4A= +golang.org/x/crypto v0.48.0 h1:/VRzVqiRSggnhY7gNRxPauEQ5Drw9haKdM0jqfcCFts= +golang.org/x/crypto v0.48.0/go.mod h1:r0kV5h3qnFPlQnBSrULhlsRfryS2pmewsg+XfMgkVos= golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 h1:R84qjqJb5nVJMxqWYb3np9L5ZsaDtB+a39EqjV0JSUM= golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0/go.mod h1:S9Xr4PYopiDyqSyp5NjCrhFrqg6A5zA2E/iPHPhqnS8= golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4= @@ -408,8 +417,8 @@ golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.15.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c= golang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c= -golang.org/x/mod v0.32.0 h1:9F4d3PHLljb6x//jOyokMv3eX+YDeepZSEo3mFJy93c= -golang.org/x/mod v0.32.0/go.mod h1:SgipZ/3h2Ci89DlEtEXWUk/HteuRin+HHhN+WbNhguU= +golang.org/x/mod v0.33.0 h1:tHFzIWbBifEmbwtGz65eaWyGiGZatSrT9prnU8DbVL8= +golang.org/x/mod v0.33.0/go.mod h1:swjeQEj+6r7fODbD2cqrnje9PnziFuw4bmLbBZFrQ5w= golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg= golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg= @@ -420,8 +429,8 @@ golang.org/x/net 
v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk= golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44= golang.org/x/net v0.25.0/go.mod h1:JkAGAh7GEvH74S6FOH42FLoXpXbE/aqXSrIQjXgsiwM= golang.org/x/net v0.33.0/go.mod h1:HXLR5J+9DxmrqMwG9qjGCxZ+zKXxBru04zlTvWlWuN4= -golang.org/x/net v0.49.0 h1:eeHFmOGUTtaaPSGNmjBKpbng9MulQsJURQUAfUwY++o= -golang.org/x/net v0.49.0/go.mod h1:/ysNB2EvaqvesRkuLAyjI1ycPZlQHM3q01F02UY/MV8= +golang.org/x/net v0.50.0 h1:ucWh9eiCGyDR3vtzso0WMQinm2Dnt8cFMuQa9K33J60= +golang.org/x/net v0.50.0/go.mod h1:UgoSli3F/pBgdJBHCTc+tp3gmrU4XswgGRgtnwWTfyM= golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.34.0 h1:hqK/t4AKgbqWkdkcAeI8XLmbK+4m4G5YeQRrmiotGlw= golang.org/x/oauth2 v0.34.0/go.mod h1:lzm5WQJQwKZ3nwavOZ3IS5Aulzxi68dUSgRHujetwEA= @@ -432,8 +441,8 @@ golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y= golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sync v0.10.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= -golang.org/x/sync v0.19.0 h1:vV+1eWNmZ5geRlYjzm2adRgW2/mcpevXNg50YZtPCE4= -golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI= +golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4= +golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0= golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= @@ -441,13 +450,14 @@ golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBc golang.org/x/sys 
v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= +golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= golang.org/x/sys v0.20.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= golang.org/x/sys v0.28.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= -golang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ= -golang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks= +golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo= +golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw= golang.org/x/telemetry v0.0.0-20240228155512-f48c80bd79b2/go.mod h1:TeRTkGYfJXctD9OcfyVLyj2J3IxLnKwHJR8f4D8a3YE= golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= @@ -457,8 +467,8 @@ golang.org/x/term v0.12.0/go.mod h1:owVbMEjm3cBLCHdkQu9b1opXd4ETQWc3BhuQGKgXgvU= golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk= golang.org/x/term v0.20.0/go.mod h1:8UkIAJTvZgivsXaD6/pH6U9ecQzZ45awqEOzuCvwpFY= golang.org/x/term v0.27.0/go.mod h1:iMsnZpn0cago0GOrHO2+Y7u7JPn5AylBrcoWkElMTSM= -golang.org/x/term v0.39.0 h1:RclSuaJf32jOqZz74CkPA9qFuVTX7vhLlpfj/IGWlqY= -golang.org/x/term v0.39.0/go.mod h1:yxzUCTP/U+FzoxfdKmLaA0RV1WgE0VY7hXBwKtY/4ww= +golang.org/x/term v0.40.0 h1:36e4zGLqU4yhjlmxEaagx2KuYbJq3EwY8K943ZsHcvg= +golang.org/x/term 
v0.40.0/go.mod h1:w2P8uVp06p2iyKKuvXIm7N/y0UCRt3UfJTfZ7oOpglM= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ= @@ -468,8 +478,8 @@ golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE= golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU= golang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU= golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ= -golang.org/x/text v0.33.0 h1:B3njUFyqtHDUI5jMn1YIr5B0IE2U0qck04r6d4KPAxE= -golang.org/x/text v0.33.0/go.mod h1:LuMebE6+rBincTi9+xWTY8TztLzKHc/9C1uBCG27+q8= +golang.org/x/text v0.34.0 h1:oL/Qq0Kdaqxa1KbNeMKwQq0reLCCaFtqu2eNuSeNHbk= +golang.org/x/text v0.34.0/go.mod h1:homfLqTYRFyVYemLBFl5GgL/DWEiH5wcsQ5gSh1yziA= golang.org/x/time v0.12.0 h1:ScB/8o8olJvc+CQPWrK3fPZNfh7qgwCrY0zJmoEQLSE= golang.org/x/time v0.12.0/go.mod h1:CDIdPxbZBQxdj6cxyCIdrNogrJKMJ7pr37NYpMcMDSg= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= @@ -478,8 +488,8 @@ golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58= golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk= -golang.org/x/tools v0.41.0 h1:a9b8iMweWG+S0OBnlU36rzLp20z1Rp10w+IY2czHTQc= -golang.org/x/tools v0.41.0/go.mod h1:XSY6eDqxVNiYgezAVqqCeihT4j1U2CCsqvH3WhQpnlg= +golang.org/x/tools v0.42.0 h1:uNgphsn75Tdz5Ji2q36v/nsFSfR/9BRFvqhGBaJGd5k= +golang.org/x/tools v0.42.0/go.mod h1:Ma6lCIwGZvHK6XtgbswSoWroEkhugApmsXyrUmBhfr0= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod 
h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= gonum.org/v1/gonum v0.16.0 h1:5+ul4Swaf3ESvrOnidPp4GZbzf0mxVQpDCYUQE7OJfk= @@ -537,6 +547,34 @@ k8s.io/kubectl v0.35.2 h1:aSmqhSOfsoG9NR5oR8OD5eMKpLN9x8oncxfqLHbJJII= k8s.io/kubectl v0.35.2/go.mod h1:+OJC779UsDJGxNPbHxCwvb4e4w9Eh62v/DNYU2TlsyM= k8s.io/utils v0.0.0-20251002143259-bc988d571ff4 h1:SjGebBtkBqHFOli+05xYbK8YF1Dzkbzn+gDM4X9T4Ck= k8s.io/utils v0.0.0-20251002143259-bc988d571ff4/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0= +modernc.org/cc/v4 v4.28.2 h1:3tQ0lf2ADtoby2EtSP+J7IE2SHwEJdP8ioR59wx7XpY= +modernc.org/cc/v4 v4.28.2/go.mod h1:OnovgIhbbMXMu1aISnJ0wvVD1KnW+cAUJkIrAWh+kVI= +modernc.org/ccgo/v4 v4.34.0 h1:yRLPFZieg532OT4rp4JFNIVcquwalMX26G95WQDqwCQ= +modernc.org/ccgo/v4 v4.34.0/go.mod h1:AS5WYMyBakQ+fhsHhtP8mWB82KTGPkNNJDGfGQCe0/A= +modernc.org/fileutil v1.4.0 h1:j6ZzNTftVS054gi281TyLjHPp6CPHr2KCxEXjEbD6SM= +modernc.org/fileutil v1.4.0/go.mod h1:EqdKFDxiByqxLk8ozOxObDSfcVOv/54xDs/DUHdvCUU= +modernc.org/gc/v2 v2.6.5 h1:nyqdV8q46KvTpZlsw66kWqwXRHdjIlJOhG6kxiV/9xI= +modernc.org/gc/v2 v2.6.5/go.mod h1:YgIahr1ypgfe7chRuJi2gD7DBQiKSLMPgBQe9oIiito= +modernc.org/gc/v3 v3.1.2 h1:ZtDCnhonXSZexk/AYsegNRV1lJGgaNZJuKjJSWKyEqo= +modernc.org/gc/v3 v3.1.2/go.mod h1:HFK/6AGESC7Ex+EZJhJ2Gni6cTaYpSMmU/cT9RmlfYY= +modernc.org/goabi0 v0.2.0 h1:HvEowk7LxcPd0eq6mVOAEMai46V+i7Jrj13t4AzuNks= +modernc.org/goabi0 v0.2.0/go.mod h1:CEFRnnJhKvWT1c1JTI3Avm+tgOWbkOu5oPA8eH8LnMI= +modernc.org/libc v1.72.3 h1:ZnDF4tXn4NBXFutMMQC4vtbTFSXhhKzR73fv0beZEAU= +modernc.org/libc v1.72.3/go.mod h1:dn0dZNnnn1clLyvRxLxYExxiKRZIRENOfqQ8XEeg4Qs= +modernc.org/mathutil v1.7.1 h1:GCZVGXdaN8gTqB1Mf/usp1Y/hSqgI2vAGGP4jZMCxOU= +modernc.org/mathutil v1.7.1/go.mod h1:4p5IwJITfppl0G4sUEDtCr4DthTaT47/N3aT6MhfgJg= +modernc.org/memory v1.11.0 h1:o4QC8aMQzmcwCK3t3Ux/ZHmwFPzE6hf2Y5LbkRs+hbI= +modernc.org/memory v1.11.0/go.mod 
h1:/JP4VbVC+K5sU2wZi9bHoq2MAkCnrt2r98UGeSK7Mjw= +modernc.org/opt v0.2.0 h1:tGyef5ApycA7FSEOMraay9SaTk5zmbx7Tu+cJs4QKZg= +modernc.org/opt v0.2.0/go.mod h1:03fq9lsNfvkYSfxrfUhZCWPk1lm4cq4N+Bh//bEtgns= +modernc.org/sortutil v1.2.1 h1:+xyoGf15mM3NMlPDnFqrteY07klSFxLElE2PVuWIJ7w= +modernc.org/sortutil v1.2.1/go.mod h1:7ZI3a3REbai7gzCLcotuw9AC4VZVpYMjDzETGsSMqJE= +modernc.org/sqlite v1.52.0 h1:p4dhYh2tXZCiyaqHwRVJDjIGKWyXayiQpThxgDzJaxo= +modernc.org/sqlite v1.52.0/go.mod h1:tcNzv5p84E0skkmJn038y+hWJbLQXQqEnQfeh5r2JLM= +modernc.org/strutil v1.2.1 h1:UneZBkQA+DX2Rp35KcM69cSsNES9ly8mQWD71HKlOA0= +modernc.org/strutil v1.2.1/go.mod h1:EHkiggD70koQxjVdSBM3JKM7k6L0FbGE5eymy9i3B9A= +modernc.org/token v1.1.0 h1:Xl7Ap9dKaEs5kLoOQeQmPWevfnk/DM5qcLcYlA8ys6Y= +modernc.org/token v1.1.0/go.mod h1:UGzOrNV1mAFSEB63lOFHIpNRUVMvYTc6yu1SMY/XTDM= sigs.k8s.io/controller-runtime v0.21.0 h1:CYfjpEuicjUecRk+KAeyYh+ouUBn4llGyDYytIGcJS8= sigs.k8s.io/controller-runtime v0.21.0/go.mod h1:OSg14+F65eWqIu4DceX7k/+QRAbTTvxeQSNSOQpukWM= sigs.k8s.io/e2e-framework v0.6.0 h1:p7hFzHnLKO7eNsWGI2AbC1Mo2IYxidg49BiT4njxkrM= diff --git a/pkg/dashboard/README.md b/pkg/dashboard/README.md new file mode 100644 index 0000000000..89042d799f --- /dev/null +++ b/pkg/dashboard/README.md @@ -0,0 +1,307 @@ +# osde2e Dashboard + +**JIRA**: SDCICD-1823 + +A web dashboard for monitoring osde2e operations across environments, providing visibility into cluster reserves, usage metrics, and test results. 
+ +## Features + +- **Cluster Reserve Tracking**: View all reserved clusters from OCM with status, expiration, and availability +- **Cluster Usage Metrics**: Aggregate cluster usage by environment, state, and cloud provider +- **Test Results**: Browse recent test executions from S3 with pass/fail status and logs +- **REST API**: JSON endpoints for programmatic access to all data +- **Static Snapshots**: On-demand data retrieval (no polling/websockets) + +## Architecture + +### Components + +``` +pkg/dashboard/ +├── models/ # Data models (ClusterReserve, TestResult, etc.) +├── config/ # Dashboard configuration (reuses common config) +├── collectors/ # Data collectors for OCM and S3 +│ ├── reserves.go # OCM cluster reserve collector +│ ├── usage.go # OCM cluster usage aggregator +│ └── s3tests.go # S3 test results parser +├── server/ # HTTP server and routing +│ └── server.go # Main server with API handlers +├── handlers/ # HTTP handlers and utilities +│ └── utils.go # Helper functions +├── templates/ # HTML templates (TODO) +└── docs/ # API documentation (TODO) + +cmd/osde2e/dashboard/ +└── cmd.go # CLI command with flags +``` + +### Data Sources + +1. **OCM API** (via existing `ocmprovider.OCMProvider`) + - Cluster reserves with `Availability=reserved` + - Cluster properties and metadata + - State tracking (ready, installing, pending) + +2. 
**S3 Bucket** `osde2e-logs` (via existing `aws.CcsAwsSession`) + - Path: `test-results////` + - JUnit XML test results + - Test output logs + +## Usage + +### Start the Dashboard + +```bash +# Basic usage (uses defaults) +osde2e dashboard + +# Custom port +osde2e dashboard --port 9000 + +# Filter by environment +osde2e dashboard --environment production + +# Limit test results +osde2e dashboard --max-results 50 + +# With configuration +osde2e dashboard --configs prod --secret-locations /path/to/secrets +``` + +### Required Environment Variables + +```bash +# OCM Configuration +export OCM_CONFIG=/path/to/ocm.json + +# AWS Configuration (for S3 access) +export AWS_ACCESS_KEY_ID=your_key +export AWS_SECRET_ACCESS_KEY=your_secret +export LOG_BUCKET=osde2e-logs # Optional, defaults to osde2e-logs +``` + +### API Endpoints + +All endpoints return JSON responses. + +#### Dashboard Overview +``` +GET /api/v1/overview +``` +Returns aggregated dashboard data including reserves, usage, and recent tests. + +#### Cluster Reserves +``` +GET /api/v1/reserves +``` +Lists all reserved clusters from OCM. + +Response: +```json +{ + "success": true, + "data": [ + { + "id": "cluster-123", + "name": "osde2e-abc", + "state": "ready", + "availability": "reserved", + "version": "openshift-v4.14.0", + "region": "us-east-1", + "cloud_provider": "aws", + "created_at": "2026-04-30T10:00:00Z", + "expires_at": "2026-05-01T10:00:00Z", + "product": "rosa" + } + ] +} +``` + +#### Cluster Usage +``` +GET /api/v1/usage +GET /api/v1/usage?environment=production +``` +Returns cluster usage metrics aggregated by environment. 
+ +Response: +```json +{ + "success": true, + "data": [ + { + "environment": "production", + "total_clusters": 25, + "by_state": { + "ready": 20, + "installing": 3, + "pending": 2 + }, + "by_availability": { + "reserved": 10, + "claimed": 8, + "used": 7 + }, + "last_updated": "2026-04-30T12:00:00Z" + } + ] +} +``` + +#### Test Results +``` +GET /api/v1/tests +GET /api/v1/tests/{job-id} +``` +Lists recent test results or retrieves a specific test by job ID. + +Response: +```json +{ + "success": true, + "data": [ + { + "job_id": "abc123", + "job_name": "periodic-ci-openshift-osde2e", + "component": "osd-example-operator", + "date": "2026-04-30", + "status": "passed", + "total_tests": 50, + "passed_tests": 48, + "failed_tests": 2, + "skipped_tests": 0, + "duration_seconds": 1234.5, + "s3_path": "test-results/osd-example-operator/2026-04-30/abc123", + "log_url": "https://s3.amazonaws.com/...", + "junit_xml_url": "https://s3.amazonaws.com/...", + "timestamp": "2026-04-30T11:30:00Z" + } + ] +} +``` + +#### Health Check +``` +GET /health +``` +Returns server health status. 
+ +Response: +```json +{ + "status": "ok", + "timestamp": "2026-04-30T12:00:00Z", + "ocm_connected": true, + "s3_connected": true +} +``` + +## Configuration + +The dashboard reuses existing osde2e configuration: + +| Config Key | Environment Variable | Default | Description | +|------------|---------------------|---------|-------------| +| `dashboard.port` | - | `8080` | HTTP server port | +| `dashboard.environment` | - | `all` | Filter environment | +| `dashboard.maxTestResults` | - | `100` | Max test results to return | +| `tests.logBucket` | `LOG_BUCKET` | `osde2e-logs` | S3 bucket for test results | +| `config.aws.region` | `AWS_REGION` | `us-east-1` | S3 bucket region | +| `ocmConfig` | `OCM_CONFIG` | - | Path to OCM config file | + +## Implementation Status + +### Completed ✅ +- [x] Data models and types +- [x] Configuration management (reuses common config) +- [x] OCM cluster reserve collector +- [x] OCM cluster usage collector +- [x] S3 test results collector (with JUnit XML parsing) +- [x] HTTP server with routing +- [x] REST API handlers +- [x] Dashboard command (CLI) +- [x] Integration with main osde2e command + +### TODO 🚧 +- [ ] HTML templates for web UI +- [ ] CSS styling for dashboard pages +- [ ] Unit tests for collectors +- [ ] Unit tests for handlers +- [ ] Integration tests +- [ ] Build verification +- [ ] Deployment documentation + +## Development + +### Project Structure + +The dashboard follows osde2e patterns: +- Reuses existing AWS and OCM connections +- Uses viper for configuration +- Follows cobra command structure +- Integrates with existing providers + +### Adding New Features + +1. **New Data Source**: Add collector in `collectors/` +2. **New API Endpoint**: Add handler in `server/server.go` +3. **New Configuration**: Add to `config/config.go` +4. **New Model**: Add to `models/types.go` + +### Testing + +```bash +# Run unit tests (when implemented) +go test ./pkg/dashboard/... 
+ +# Run with test configuration +osde2e dashboard --configs test --port 8080 + +# Test API endpoints +curl http://localhost:8080/api/v1/reserves +curl http://localhost:8080/api/v1/usage +curl http://localhost:8080/api/v1/tests +curl http://localhost:8080/health +``` + +## Next Steps + +1. **Build Verification**: Test compilation and fix any errors +2. **HTML Templates**: Create Go templates for web UI +3. **Testing**: Add comprehensive unit and integration tests +4. **Documentation**: Complete API documentation +5. **Deployment**: Add deployment instructions and examples + +## Contributing + +When adding new features: +1. Follow existing code patterns +2. Reuse common osde2e utilities +3. Add appropriate error handling +4. Update this README +5. Add tests for new functionality + +## Troubleshooting + +### OCM Connection Issues +``` +Warning: OCM_CONFIG not set. OCM features may not work. +``` +Solution: Set `OCM_CONFIG` environment variable to your ocm.json path. + +### S3 Access Issues +``` +Warning: LOG_BUCKET not set. S3 test results will not be available. +``` +Solution: Set `LOG_BUCKET` and AWS credentials. + +### No Data Returned +Check that: +- OCM config is valid and accessible +- AWS credentials have S3 read access +- Clusters exist with `MadeByOSDe2e=true` property +- Test results exist in S3 bucket + +## License + +Same as osde2e project. 
\ No newline at end of file diff --git a/pkg/dashboard/collectors/deliverables.go b/pkg/dashboard/collectors/deliverables.go new file mode 100644 index 0000000000..a65dbccbdb --- /dev/null +++ b/pkg/dashboard/collectors/deliverables.go @@ -0,0 +1,523 @@ +package collectors + +import ( + "fmt" + "io" + "log" + "regexp" + "sort" + "strings" + "sync" + "time" + + "github.com/aws/aws-sdk-go/aws" + awssession "github.com/aws/aws-sdk-go/aws/session" + "github.com/aws/aws-sdk-go/service/s3" + "github.com/openshift/osde2e/pkg/dashboard/models" +) + +const downloadWorkers = 20 + +// versionRegex matches semver tags (v1.2.3) or short git SHAs (7-10 hex chars) +var versionRegex = regexp.MustCompile(`^(v\d+(\.\d+)*|[0-9a-f]{7,10})$`) + +var knownEnvSuffixes = []string{"integration", "stage", "prod", "int"} + +// DeliverableCollector scans S3 for operator test results grouped by name, version, and environment. +type DeliverableCollector struct { + s3Client *s3.S3 + bucket string + region string + lookbackDays int +} + +// NewDeliverableCollector creates a new collector using the standard AWS credential chain +// (env vars → ~/.aws/credentials → IAM role), independent of the osde2e viper config. +func NewDeliverableCollector(bucket, region string, lookbackDays int) (*DeliverableCollector, error) { + sess, err := awssession.NewSession(aws.NewConfig().WithRegion(region)) + if err != nil { + return nil, fmt.Errorf("failed to create AWS session: %w", err) + } + + s3Client := s3.New(sess) + + if lookbackDays <= 0 { + lookbackDays = 30 + } + + return &DeliverableCollector{ + s3Client: s3Client, + bucket: bucket, + region: region, + lookbackDays: lookbackDays, + }, nil +} + +// S3Client returns the underlying S3 client and bucket name, used by the server's S3 proxy handler. +func (c *DeliverableCollector) S3Client() (*s3.S3, string) { return c.s3Client, c.bucket } + +// parseComponentPath splits an S3 component string into operator name, version, and environment. 
+func parseComponentPath(component string) (name, version, env string) { + tokens := strings.Split(component, "-") + + env = "unknown" + if len(tokens) > 0 { + last := tokens[len(tokens)-1] + for _, suffix := range knownEnvSuffixes { + if strings.EqualFold(last, suffix) { + env = strings.ToLower(last) + tokens = tokens[:len(tokens)-1] + break + } + } + } + + version = "unknown" + versionIdx := -1 + for i := len(tokens) - 1; i >= 0; i-- { + if versionRegex.MatchString(tokens[i]) { + version = tokens[i] + versionIdx = i + break + } + } + + if versionIdx > 0 { + name = strings.Join(tokens[:versionIdx], "-") + } else if versionIdx == 0 { + name = "unknown" + } else { + name = strings.Join(tokens, "-") + } + + if name == "" { + name = "unknown" + } + + return name, version, env +} + +// candidate holds a JUnit key identified during listing, before downloading. +type candidate struct { + key string + component string + dateStr string + jobID string + modified time.Time // S3 LastModified, used to pick the newest per group +} + +// downloadResult is the outcome of fetching and parsing one candidate. +type downloadResult struct { + name string + version string + env string + jobID string + s3Dir string + key string + suite *JUnitTestSuite + ts time.Time +} + +// CollectDeliverables scans S3 for junit XML files within the lookback window, +// groups them by operator name + version, and returns the latest result per environment. +func (c *DeliverableCollector) CollectDeliverables() ([]models.DeliverableStatus, error) { + cutoff := time.Now().UTC().AddDate(0, 0, -c.lookbackDays) + + // Phase 1: list all matching keys, deduplicate to newest per (name, version, env). + // S3 listing is cheap; downloading is not. We only download one file per group. 
+ type groupKey struct{ name, version, env string } + newestByGroup := make(map[groupKey]*candidate) + + input := &s3.ListObjectsV2Input{ + Bucket: aws.String(c.bucket), + Prefix: aws.String("test-results/"), + } + + err := c.s3Client.ListObjectsV2Pages(input, func(page *s3.ListObjectsV2Output, _ bool) bool { + for _, obj := range page.Contents { + key := aws.StringValue(obj.Key) + if !strings.HasSuffix(key, ".xml") || !strings.Contains(key, "junit") { + continue + } + + // Format: test-results//// + parts := strings.SplitN(key, "/", 5) + if len(parts) < 5 { + continue + } + + dateStr := parts[2] + keyDate, err := time.Parse("2006-01-02", dateStr) + if err != nil || keyDate.Before(cutoff) { + continue + } + + component := parts[1] + name, version, env := parseComponentPath(component) + gk := groupKey{name, version, env} + + modified := aws.TimeValue(obj.LastModified) + existing, seen := newestByGroup[gk] + if !seen || modified.After(existing.modified) { + newestByGroup[gk] = &candidate{ + key: key, + component: component, + dateStr: dateStr, + jobID: parts[3], + modified: modified, + } + } + } + return true + }) + if err != nil { + return nil, fmt.Errorf("failed to list S3 objects: %w", err) + } + + log.Printf("Deliverable collector: %d unique (name, version, env) groups to download", len(newestByGroup)) + + // Phase 2: fan out downloads with a worker pool. 
+ candidates := make([]*candidate, 0, len(newestByGroup)) + groupKeys := make([]groupKey, 0, len(newestByGroup)) + for gk, cand := range newestByGroup { + candidates = append(candidates, cand) + groupKeys = append(groupKeys, gk) + } + + results := make([]*downloadResult, len(candidates)) + var wg sync.WaitGroup + sem := make(chan struct{}, downloadWorkers) + + for i, cand := range candidates { + wg.Add(1) + go func(i int, cand *candidate, gk groupKey) { + defer wg.Done() + sem <- struct{}{} + defer func() { <-sem }() + + suite, ts, err := c.downloadAndParseJUnit(cand.key) + if err != nil { + log.Printf("Warning: skipping %s: %v", cand.key, err) + return + } + + parts := strings.SplitN(cand.key, "/", 5) + s3Dir := strings.Join(parts[:4], "/") + + _, version, _ := parseComponentPath(cand.component) + + env := gk.env + if env == "unknown" { // the log lives under the full component directory, not the parsed name + if detected := c.fetchEnvFromLog(cand.component, parts[2], parts[3]); detected != "" { + env = detected + } + } + + results[i] = &downloadResult{ + name: gk.name, + version: version, + env: env, + jobID: cand.jobID, + s3Dir: s3Dir, + key: cand.key, + suite: suite, + ts: ts, + } + }(i, cand, groupKeys[i]) + } + wg.Wait() + + // Phase 3: build the index.
+ index := make(map[string]*models.DeliverableStatus) + for _, r := range results { + if r == nil { + continue + } + + status := suiteStatus(r.suite) + logURL := s3URL(c.bucket, r.s3Dir+"/test_output.log") + junitURL := junitURL(c.bucket, r.key) + + indexKey := r.name + op, exists := index[indexKey] + if !exists { + op = &models.DeliverableStatus{ + Name: r.name, + Version: r.version, + Results: make(map[string]*models.EnvironmentResult), + } + index[indexKey] = op + } + + failedTests := extractFailedTests(r.suite) + + op.Results[r.env] = &models.EnvironmentResult{ + Status: status, + Version: r.version, + Total: r.suite.Tests, + Passed: r.suite.Tests - r.suite.Failures - r.suite.Errors - r.suite.Skipped, + Failed: r.suite.Failures, + Skipped: r.suite.Skipped, + Errors: r.suite.Errors, + LastRun: r.ts, + JobID: r.jobID, + LogURL: logURL, + JUnitURL: junitURL, + FailedTests: failedTests, + } + + if r.ts.After(op.LastUpdated) { + op.LastUpdated = r.ts + } + } + + result := make([]models.DeliverableStatus, 0, len(index)) + for _, op := range index { + result = append(result, *op) + } + sort.Slice(result, func(i, j int) bool { + if result[i].Name != result[j].Name { + return result[i].Name < result[j].Name + } + return result[i].Version < result[j].Version + }) + + log.Printf("Collected deliverable status for %d operator+version combinations", len(result)) + return result, nil +} + +// adHocImageRegex extracts the image tag from AdHocTestImages in two formats: +// 1. "Successfully added property[AdHocTestImages] - quay.io/.../operator-e2e:c7fabd7" +// 2. "--properties AdHocTestImages:quay.io/.../operator-e2e:ec3ce7b" (rosa CLI args) +var adHocImageRegex = regexp.MustCompile(`AdHocTestImages[:\]] ?-? ?\S+:(\S+?)[ "]`) + +// fetchMetaFromLog reads test_output.log and extracts both the environment +// ("Will load config ") and the image tag from the AdHocTestImages property line. 
+func (c *DeliverableCollector) fetchMetaFromLog(name, date, jobID string) (env, version string) { + logKey := fmt.Sprintf("test-results/%s/%s/%s/test_output.log", name, date, jobID) + output, err := c.s3Client.GetObject(&s3.GetObjectInput{ + Bucket: aws.String(c.bucket), + Key: aws.String(logKey), + }) + if err != nil { + return "", "" + } + defer output.Body.Close() + + buf := make([]byte, 16384) // 16KB: enough for the header lines + n, _ := io.ReadFull(output.Body, buf) // a single Read may return fewer bytes than are available + content := string(buf[:n]) + + for _, e := range []string{"stage", "prod", "int"} { + if strings.Contains(content, "Will load config "+e) { + env = e + break + } + } + + if m := adHocImageRegex.FindStringSubmatch(content); len(m) == 2 { + version = strings.TrimSpace(m[1]) + } + + return env, version +} + +// fetchEnvFromLog is kept for callers that only need the environment. +func (c *DeliverableCollector) fetchEnvFromLog(name, date, jobID string) string { + env, _ := c.fetchMetaFromLog(name, date, jobID) + return env +} + +// downloadAndParseJUnit fetches and parses a JUnit XML from S3. +func (c *DeliverableCollector) downloadAndParseJUnit(key string) (*JUnitTestSuite, time.Time, error) { + output, err := c.s3Client.GetObject(&s3.GetObjectInput{ + Bucket: aws.String(c.bucket), + Key: aws.String(key), + }) + if err != nil { + return nil, time.Time{}, fmt.Errorf("GetObject failed: %w", err) + } + defer output.Body.Close() + + data, err := io.ReadAll(output.Body) + if err != nil { + return nil, time.Time{}, fmt.Errorf("read failed: %w", err) + } + + suite, err := parseJUnitData(data) + if err != nil { + return nil, time.Time{}, err + } + + ts := parseTimestamp(suite.Timestamp) + + return suite, ts, nil +} + +// extractFailedTests pulls failed/errored test case names and messages from a suite.
+func extractFailedTests(suite *JUnitTestSuite) []models.FailedTestCase { + var out []models.FailedTestCase + for _, tc := range suite.TestCases { + var msg string + if tc.Failure != nil { + msg = *tc.Failure + } else if tc.Error != nil { + msg = *tc.Error + } else { + continue + } + if len(msg) > 600 { + msg = msg[:600] + "…" + } + out = append(out, models.FailedTestCase{Name: tc.Name, Message: msg}) + } + return out +} + +// CollectPipelineHistory scans all S3 runs for a named operator and returns every +// (version, env, date, jobID) tuple found, sorted newest first. +func (c *DeliverableCollector) CollectPipelineHistory(operatorName string) (*models.PipelineHistory, error) { + prefix := "test-results/" + + type runKey struct { + component string + dateStr string + jobID string + } + seen := make(map[runKey]bool) + var candidates []runKey + + input := &s3.ListObjectsV2Input{ + Bucket: aws.String(c.bucket), + Prefix: aws.String(prefix), + } + + err := c.s3Client.ListObjectsV2Pages(input, func(page *s3.ListObjectsV2Output, _ bool) bool { + for _, obj := range page.Contents { + key := aws.StringValue(obj.Key) + if !strings.HasSuffix(key, ".xml") || !strings.Contains(key, "junit") { + continue + } + parts := strings.SplitN(key, "/", 5) + if len(parts) < 5 { + continue + } + component := parts[1] + name, _, _ := parseComponentPath(component) + if name != operatorName { + continue + } + rk := runKey{component: component, dateStr: parts[2], jobID: parts[3]} + if !seen[rk] { + seen[rk] = true + candidates = append(candidates, rk) + } + } + return true + }) + if err != nil { + return nil, fmt.Errorf("failed to list S3 objects: %w", err) + } + + // Fan-out: download each unique run in parallel + type rawRun struct { + version string + env string + dateStr string + jobID string + s3Dir string + key string + suite *JUnitTestSuite + ts time.Time + } + + rawRuns := make([]*rawRun, len(candidates)) + var wg sync.WaitGroup + sem := make(chan struct{}, downloadWorkers) + + for 
i, rk := range candidates { + wg.Add(1) + go func(i int, rk runKey) { + defer wg.Done() + sem <- struct{}{} + defer func() { <-sem }() + + // Find the JUnit XML key for this run + listOut, err := c.s3Client.ListObjectsV2(&s3.ListObjectsV2Input{ + Bucket: aws.String(c.bucket), + Prefix: aws.String(fmt.Sprintf("test-results/%s/%s/%s/", rk.component, rk.dateStr, rk.jobID)), + }) + if err != nil { + return + } + var junitKey string + for _, obj := range listOut.Contents { + k := aws.StringValue(obj.Key) + if strings.HasSuffix(k, ".xml") && strings.Contains(k, "junit") { + junitKey = k + break + } + } + if junitKey == "" { + return + } + + suite, ts, err := c.downloadAndParseJUnit(junitKey) + if err != nil { + log.Printf("Warning: history skip %s: %v", junitKey, err) + return + } + + _, version, env := parseComponentPath(rk.component) + if env == "unknown" { // the log lives under the full component directory, not the parsed name + if detected := c.fetchEnvFromLog(rk.component, rk.dateStr, rk.jobID); detected != "" { + env = detected + } + } + + s3Dir := fmt.Sprintf("test-results/%s/%s/%s", rk.component, rk.dateStr, rk.jobID) + rawRuns[i] = &rawRun{ + version: version, + env: env, + dateStr: rk.dateStr, + jobID: rk.jobID, + s3Dir: s3Dir, + key: junitKey, + suite: suite, + ts: ts, + } + }(i, rk) + } + wg.Wait() + + var runs []models.PipelineRun + for _, r := range rawRuns { + if r == nil { + continue + } + runs = append(runs, models.PipelineRun{ + Version: r.version, + Env: r.env, + Status: suiteStatus(r.suite), + Date: r.dateStr, + JobID: r.jobID, + LastRun: r.ts, + LogURL: s3URL(c.bucket, r.s3Dir+"/test_output.log"), + JUnitURL: junitURL(c.bucket, r.key), + Failed: extractFailedTests(r.suite), + Total: r.suite.Tests, + Passed: r.suite.Tests - r.suite.Failures - r.suite.Errors - r.suite.Skipped, + }) + } + + // Sort newest first + sort.Slice(runs, func(i, j int) bool { + return runs[i].LastRun.After(runs[j].LastRun) + }) + + return &models.PipelineHistory{ + Name: operatorName, + Runs: runs, + }, nil +} + diff --git
a/pkg/dashboard/collectors/helpers.go b/pkg/dashboard/collectors/helpers.go new file mode 100644 index 0000000000..382e3375c0 --- /dev/null +++ b/pkg/dashboard/collectors/helpers.go @@ -0,0 +1,39 @@ +package collectors + +import ( + "net/url" + "time" +) + +// suiteStatus returns "passed", "failed", or "error" based on a parsed JUnit suite. +func suiteStatus(suite *JUnitTestSuite) string { + if suite.Failures > 0 { + return "failed" + } + if suite.Errors > 0 { + return "error" + } + return "passed" +} + +// parseTimestamp parses a JUnit timestamp string, returning zero time on failure. +// Callers should apply their own fallback (e.g. S3 LastModified) for zero results. +func parseTimestamp(ts string) time.Time { + if t, err := time.Parse("2006-01-02T15:04:05", ts); err == nil { + return t + } + if t, err := time.Parse(time.RFC3339, ts); err == nil { + return t + } + return time.Time{} +} + +// s3URL returns a dashboard proxy URL that streams the S3 object through the server. +func s3URL(_, key string) string { + return "/dashboard/s3?key=" + url.QueryEscape(key) +} + +// junitURL returns a dashboard URL that fetches the JUnit XML from S3 and renders it as HTML. +func junitURL(_, key string) string { + return "/dashboard/junit?key=" + url.QueryEscape(key) +} diff --git a/pkg/dashboard/collectors/reserves.go b/pkg/dashboard/collectors/reserves.go new file mode 100644 index 0000000000..4375d5f671 --- /dev/null +++ b/pkg/dashboard/collectors/reserves.go @@ -0,0 +1,130 @@ +package collectors + +import ( + "fmt" + "log" + "time" + + v1 "github.com/openshift-online/ocm-sdk-go/clustersmgmt/v1" + "github.com/openshift/osde2e/pkg/common/clusterproperties" + "github.com/openshift/osde2e/pkg/common/providers/ocmprovider" + "github.com/openshift/osde2e/pkg/dashboard/models" +) + +// ReserveCollector collects cluster reserve information from one or more OCM environments. 
+type ReserveCollector struct { + providers map[string]*ocmprovider.OCMProvider +} + +// NewReserveCollector creates a new reserve collector for the given OCM environments. +func NewReserveCollector(envs ...string) (*ReserveCollector, error) { + providers := make(map[string]*ocmprovider.OCMProvider, len(envs)) + for _, env := range envs { + p, err := ocmprovider.NewWithEnv(env) + if err != nil { + log.Printf("Warning: could not create provider for environment %s: %v (skipping)", env, err) + continue + } + providers[env] = p + } + if len(providers) == 0 { + return nil, fmt.Errorf("could not connect to any OCM environment") + } + return &ReserveCollector{providers: providers}, nil +} + +// CollectReserves retrieves reserved clusters from all configured OCM environments. +func (c *ReserveCollector) CollectReserves() ([]models.ClusterReserve, error) { + query := fmt.Sprintf( + "properties.MadeByOSDe2e='true' AND properties.Availability like '%s%%'", + clusterproperties.Reserved, + ) + + var all []models.ClusterReserve + for env, p := range c.providers { + resp, err := p.GetConnection().ClustersMgmt().V1().Clusters().List(). + Search(query). + Size(500). 
+ Send() + if err != nil { + if isAuthError(err) { + log.Printf("Info: skipping reserves for env %q (OCM account not available)", env) + } else { + log.Printf("Warning: failed to query reserved clusters for env %q: %v", env, err) + } + continue + } + resp.Items().Each(func(cluster *v1.Cluster) bool { + all = append(all, c.ocmClusterToReserve(cluster)) + return true + }) + } + + log.Printf("Collected %d reserved clusters from OCM", len(all)) + return all, nil +} + +// ocmClusterToReserve converts an OCM cluster to a ClusterReserve model +func (c *ReserveCollector) ocmClusterToReserve(cluster *v1.Cluster) models.ClusterReserve { + reserve := models.ClusterReserve{ + ID: cluster.ID(), + Name: cluster.Name(), + State: string(cluster.State()), + Version: cluster.Version().ID(), + Region: cluster.Region().ID(), + CloudProvider: cluster.CloudProvider().ID(), + CreatedAt: cluster.CreationTimestamp(), + ExpiresAt: cluster.ExpirationTimestamp(), + Product: cluster.Product().ID(), + Properties: make(map[string]string), + } + + // Extract availability from properties + if props, ok := cluster.GetProperties(); ok { + for k, v := range props { + reserve.Properties[k] = v + if k == clusterproperties.Availability { + reserve.Availability = v + } + } + } + + return reserve +} + +// CollectClustersPerEnv returns all osde2e clusters grouped by environment name. +func (c *ReserveCollector) CollectClustersPerEnv() (map[string][]models.ClusterReserve, error) { + result := make(map[string][]models.ClusterReserve) + for env, p := range c.providers { + resp, err := p.GetConnection().ClustersMgmt().V1().Clusters().List(). + Search("properties.MadeByOSDe2e='true'"). + Size(1000). 
+ Send() + if err != nil { + if isAuthError(err) { + log.Printf("Info: skipping clusters for env %q (OCM account not available)", env) + } else { + log.Printf("Warning: failed to query clusters for env %q: %v", env, err) + } + continue + } + var clusters []models.ClusterReserve + resp.Items().Each(func(cluster *v1.Cluster) bool { + clusters = append(clusters, c.ocmClusterToReserve(cluster)) + return true + }) + result[env] = clusters + } + return result, nil +} + +// CountExpiringSoon counts clusters expiring within the given threshold +func (c *ReserveCollector) CountExpiringSoon(reserves []models.ClusterReserve, threshold time.Duration) int { + count := 0 + for _, r := range reserves { + if r.IsExpiringSoon(threshold) { + count++ + } + } + return count +} \ No newline at end of file diff --git a/pkg/dashboard/collectors/s3tests.go b/pkg/dashboard/collectors/s3tests.go new file mode 100644 index 0000000000..f41da29681 --- /dev/null +++ b/pkg/dashboard/collectors/s3tests.go @@ -0,0 +1,273 @@ +package collectors + +import ( + "encoding/xml" + "fmt" + "io" + "log" + "path" + "sort" + "strings" + + "github.com/aws/aws-sdk-go/aws" + "github.com/aws/aws-sdk-go/service/s3" + awscommon "github.com/openshift/osde2e/pkg/common/aws" + "github.com/openshift/osde2e/pkg/dashboard/models" +) + +// JUnitTestSuite represents a single element +type JUnitTestSuite struct { + XMLName xml.Name `xml:"testsuite"` + Name string `xml:"name,attr"` + Tests int `xml:"tests,attr"` + Failures int `xml:"failures,attr"` + Errors int `xml:"errors,attr"` + Skipped int `xml:"skipped,attr"` + Time float64 `xml:"time,attr"` + Timestamp string `xml:"timestamp,attr"` + TestCases []JUnitTestCase `xml:"testcase"` +} + +// jUnitTestSuites represents a wrapper (may contain multiple children) +type jUnitTestSuites struct { + XMLName xml.Name `xml:"testsuites"` + Tests int `xml:"tests,attr"` + Failures int `xml:"failures,attr"` + Errors int `xml:"errors,attr"` + Time float64 `xml:"time,attr"` + TestSuites 
[]JUnitTestSuite `xml:"testsuite"` +} + +// JUnitTestCase represents a single test case +type JUnitTestCase struct { + Name string `xml:"name,attr"` + Classname string `xml:"classname,attr"` + Time float64 `xml:"time,attr"` + Failure *string `xml:"failure,omitempty"` + Error *string `xml:"error,omitempty"` + Skipped *string `xml:"skipped,omitempty"` +} + +// parseJUnitData parses raw JUnit XML bytes, handling both <testsuite> and <testsuites> root elements. +// When the root is <testsuites>, child suites are merged into a single JUnitTestSuite by summing counters +// and taking the name and timestamp from the first child suite that has a timestamp. +func parseJUnitData(data []byte) (*JUnitTestSuite, error) { + // Peek at the root element name + type rootPeek struct { + XMLName xml.Name + } + var peek rootPeek + if err := xml.Unmarshal(data, &peek); err != nil { + return nil, fmt.Errorf("failed to peek XML root: %w", err) + } + + switch peek.XMLName.Local { + case "testsuite": + var suite JUnitTestSuite + if err := xml.Unmarshal(data, &suite); err != nil { + return nil, fmt.Errorf("failed to unmarshal <testsuite>: %w", err) + } + return &suite, nil + + case "testsuites": + var suites jUnitTestSuites + if err := xml.Unmarshal(data, &suites); err != nil { + return nil, fmt.Errorf("failed to unmarshal <testsuites>: %w", err) + } + // Merge all child suites into one + merged := &JUnitTestSuite{Name: "merged"} + for _, s := range suites.TestSuites { + merged.Tests += s.Tests + merged.Failures += s.Failures + merged.Errors += s.Errors + merged.Skipped += s.Skipped + merged.Time += s.Time + merged.TestCases = append(merged.TestCases, s.TestCases...)
+ if merged.Timestamp == "" && s.Timestamp != "" { + merged.Timestamp = s.Timestamp + merged.Name = s.Name + } + } + return merged, nil + + default: + return nil, fmt.Errorf("unexpected XML root element: <%s>", peek.XMLName.Local) + } +} + +// TestResultsCollector collects test results from S3 +type TestResultsCollector struct { + s3Client *s3.S3 + bucket string + region string +} + +// NewTestResultsCollector creates a new test results collector using existing AWS session +func NewTestResultsCollector(bucket, region string) (*TestResultsCollector, error) { + sess, err := awscommon.CcsAwsSession.GetSession() + if err != nil { + return nil, fmt.Errorf("failed to get AWS session: %w", err) + } + + s3Client := s3.New(sess, aws.NewConfig().WithRegion(region)) + + return &TestResultsCollector{ + s3Client: s3Client, + bucket: bucket, + region: region, + }, nil +} + +// CollectRecentTests retrieves recent test results from S3 +func (c *TestResultsCollector) CollectRecentTests(maxResults int) ([]models.TestResult, error) { + // List objects in the test-results/ prefix + prefix := "test-results/" + + input := &s3.ListObjectsV2Input{ + Bucket: aws.String(c.bucket), + Prefix: aws.String(prefix), + } + + var allResults []models.TestResult + resultsByJob := make(map[string]*models.TestResult) + + err := c.s3Client.ListObjectsV2Pages(input, func(page *s3.ListObjectsV2Output, lastPage bool) bool { + for _, obj := range page.Contents { + key := aws.StringValue(obj.Key) + + // Skip if not a JUnit XML file + if !strings.HasSuffix(key, ".xml") || !strings.Contains(key, "junit") { + continue + } + + // Parse the S3 key to extract metadata + // Format: test-results////junit*.xml + parts := strings.Split(key, "/") + if len(parts) < 4 { + continue + } + + component := parts[1] + date := parts[2] + jobID := parts[3] + + jobKey := fmt.Sprintf("%s-%s-%s", component, date, jobID) + + // Only process if we haven't seen this job yet + if _, exists := resultsByJob[jobKey]; !exists { + result, 
err := c.parseJUnitXML(key, component, date, jobID) + if err != nil { + log.Printf("Warning: failed to parse %s: %v", key, err) + continue + } + + resultsByJob[jobKey] = result + } + } + + // Stop if we have enough results + return len(resultsByJob) < maxResults + }) + + if err != nil { + return nil, fmt.Errorf("failed to list S3 objects: %w", err) + } + + // Convert map to slice + for _, result := range resultsByJob { + allResults = append(allResults, *result) + } + + // Sort by timestamp (most recent first) + sort.Slice(allResults, func(i, j int) bool { + return allResults[i].Timestamp.After(allResults[j].Timestamp) + }) + + // Limit results + if len(allResults) > maxResults { + allResults = allResults[:maxResults] + } + + log.Printf("Collected %d test results from S3", len(allResults)) + return allResults, nil +} + +// parseJUnitXML downloads and parses a JUnit XML file from S3 +func (c *TestResultsCollector) parseJUnitXML(key, component, date, jobID string) (*models.TestResult, error) { + // Download the file + output, err := c.s3Client.GetObject(&s3.GetObjectInput{ + Bucket: aws.String(c.bucket), + Key: aws.String(key), + }) + if err != nil { + return nil, fmt.Errorf("failed to download %s: %w", key, err) + } + defer output.Body.Close() + + // Parse XML + data, err := io.ReadAll(output.Body) + if err != nil { + return nil, fmt.Errorf("failed to read %s: %w", key, err) + } + + suite, err := parseJUnitData(data) + if err != nil { + return nil, err + } + + timestamp := parseTimestamp(suite.Timestamp) + + status := suiteStatus(suite) + + // Build per-test-case list + testCases := make([]models.TestCase, 0, len(suite.TestCases)) + for _, tc := range suite.TestCases { + tcStatus := "passed" + var msg string + if tc.Failure != nil { + tcStatus = "failed" + msg = *tc.Failure + } else if tc.Error != nil { + tcStatus = "error" + msg = *tc.Error + } else if tc.Skipped != nil { + tcStatus = "skipped" + msg = *tc.Skipped + } + // Trim long messages to 500 chars for the UI 
+ if len(msg) > 500 { + msg = msg[:500] + "…" + } + testCases = append(testCases, models.TestCase{ + Name: tc.Name, + Duration: tc.Time, + Status: tcStatus, + Message: msg, + }) + } + + s3Path := path.Dir(key) + logURL := s3URL(c.bucket, path.Join(s3Path, "test_output.log")) + junitURL := junitURL(c.bucket, key) + + return &models.TestResult{ + JobID: jobID, + JobName: component, + Component: component, + Date: date, + Status: status, + TotalTests: suite.Tests, + PassedTests: suite.Tests - suite.Failures - suite.Errors - suite.Skipped, + FailedTests: suite.Failures, + ErrorTests: suite.Errors, + SkippedTests: suite.Skipped, + Duration: suite.Time, + S3Path: s3Path, + LogURL: logURL, + JUnitXMLURL: junitURL, + Timestamp: timestamp, + TestCases: testCases, + }, nil +} + + diff --git a/pkg/dashboard/collectors/sqs.go b/pkg/dashboard/collectors/sqs.go new file mode 100644 index 0000000000..e3410e70c6 --- /dev/null +++ b/pkg/dashboard/collectors/sqs.go @@ -0,0 +1,359 @@ +package collectors + +import ( + "context" + "encoding/json" + "fmt" + "io" + "log" + "regexp" + "strings" + "time" + + "github.com/aws/aws-sdk-go/aws" + "github.com/aws/aws-sdk-go/service/s3" + "github.com/aws/aws-sdk-go/service/sqs" + awscommon "github.com/openshift/osde2e/pkg/common/aws" + "github.com/openshift/osde2e/pkg/dashboard/models" + "github.com/openshift/osde2e/pkg/dashboard/store" + "gopkg.in/yaml.v3" +) + +// s3Event is the top-level SQS message body for S3 event notifications. +type s3Event struct { + Records []struct { + S3 struct { + Bucket struct{ Name string } `json:"bucket"` + Object struct{ Key string } `json:"object"` + } `json:"s3"` + } `json:"Records"` +} + +// SQSConsumer polls an SQS queue for S3 ObjectCreated events and writes +// parsed JUnit results into the Store. +type SQSConsumer struct { + sqsClient *sqs.SQS + opCollect *DeliverableCollector + store *store.Store + queueURL string + bucket string +} + +// NewSQSConsumer creates a new consumer. 
+func NewSQSConsumer(queueURL, bucket, region string, st *store.Store) (*SQSConsumer, error) { + opCollect, err := NewDeliverableCollector(bucket, region, 0) + if err != nil { + return nil, fmt.Errorf("create deliverable collector: %w", err) + } + + sess, err := awscommon.CcsAwsSession.GetSession() + if err != nil { + return nil, fmt.Errorf("get AWS session: %w", err) + } + + return &SQSConsumer{ + sqsClient: sqs.New(sess), + opCollect: opCollect, + store: st, + queueURL: queueURL, + bucket: bucket, + }, nil +} + +// Run starts a long-poll loop that processes messages until ctx is cancelled. +// Call in a goroutine: go consumer.Run(ctx) +func (c *SQSConsumer) Run(ctx context.Context) { + log.Printf("SQS consumer: started, queue=%s", c.queueURL) + for { + select { + case <-ctx.Done(): + log.Printf("SQS consumer: stopped") + return + default: + } + + msgs, err := c.sqsClient.ReceiveMessageWithContext(ctx, &sqs.ReceiveMessageInput{ + QueueUrl: aws.String(c.queueURL), + MaxNumberOfMessages: aws.Int64(10), + WaitTimeSeconds: aws.Int64(20), // long poll — blocks up to 20s if queue empty + VisibilityTimeout: aws.Int64(60), + }) + if err != nil { + if ctx.Err() != nil { + return + } + log.Printf("SQS consumer: receive error: %v — retrying in 10s", err) + select { + case <-time.After(10 * time.Second): + case <-ctx.Done(): + return + } + continue + } + + for _, msg := range msgs.Messages { + if err := c.processMessage(aws.StringValue(msg.Body)); err != nil { + log.Printf("SQS consumer: process error: %v", err) + // Leave on queue — will become visible again after VisibilityTimeout. + continue + } + _, _ = c.sqsClient.DeleteMessage(&sqs.DeleteMessageInput{ + QueueUrl: aws.String(c.queueURL), + ReceiptHandle: msg.ReceiptHandle, + }) + } + } +} + +// processMessage parses one SQS message body (direct S3 event or SNS-wrapped). +func (c *SQSConsumer) processMessage(body string) error { + // SNS wraps the S3 JSON event inside a "Message" string field. 
+ var wrapper struct{ Message string } + raw := body + if err := json.Unmarshal([]byte(body), &wrapper); err == nil && wrapper.Message != "" { + raw = wrapper.Message + } + + var event s3Event + if err := json.Unmarshal([]byte(raw), &event); err != nil { + return fmt.Errorf("unmarshal S3 event: %w", err) + } + + var failed int + for _, rec := range event.Records { + if err := c.processKey(rec.S3.Bucket.Name, rec.S3.Object.Key); err != nil { + log.Printf("SQS consumer: skip %s: %v", rec.S3.Object.Key, err) + failed++ + } + } + if failed > 0 { + return fmt.Errorf("%d record(s) failed processing; message will be retried", failed) + } + return nil +} + +// processKey downloads, parses, and stores the result for a single S3 JUnit key. +// Expected key format: test-results/<component>/<date>/<jobID>/<file>.xml +func (c *SQSConsumer) processKey(bucket, key string) error { + if !strings.HasSuffix(key, ".xml") || !strings.Contains(key, "junit") { + return nil + } + + parts := strings.SplitN(key, "/", 5) + if len(parts) < 5 { + return fmt.Errorf("unexpected key format: %s", key) + } + + component := parts[1] + dateStr := parts[2] + jobID := parts[3] + + name, version, env := parseComponentPath(component) + + // Always read the log to get env + image tag — these paths are unversioned + // so parseComponentPath returns "unknown" for both env and version. + logEnv, logVersion := c.opCollect.fetchMetaFromLog(name, dateStr, jobID) + if env == "unknown" && logEnv != "" { + env = logEnv + } + if version == "unknown" && logVersion != "" { + version = logVersion + } + + suite, ts, err := c.opCollect.downloadAndParseJUnit(key) + if err != nil { + return fmt.Errorf("parse junit %s: %w", key, err) + } + + status := suiteStatus(suite) + + s3Dir := strings.Join(parts[:4], "/") + + // Only fetch LLM analysis for failed runs — no point for passing ones. 
+ var llm *models.LLMAnalysis + if status != "passed" { + llm = c.fetchLLMAnalysis(bucket, s3Dir) + } + + rec := store.RunRecord{ + Name: name, + Env: env, + Version: version, + Status: status, + Passed: suite.Tests - suite.Failures - suite.Errors - suite.Skipped, + Failed: suite.Failures + suite.Errors, + Total: suite.Tests, + JobID: jobID, + Date: dateStr, + LastRun: ts, + LogURL: s3URL(c.opCollect.bucket, s3Dir+"/test_output.log"), + JUnitURL: junitURL(c.opCollect.bucket, key), + FailedTests: extractFailedTests(suite), + LLMAnalysis: llm, + } + + if err := c.store.UpsertRun(rec); err != nil { + return fmt.Errorf("upsert: %w", err) + } + + log.Printf("SQS consumer: stored %s %s %s → %s", name, version, env, status) + return nil +} + +// summaryYAML mirrors the relevant fields of summary.yaml produced by the LLM analysis job. +type summaryYAML struct { + Response string `yaml:"response"` + Status string `yaml:"status"` +} + +// llmResponse is the JSON embedded in the response field (may be wrapped in ```json ... ```). +type llmResponse struct { + RootCause string `json:"root_cause"` + Recommendations []string `json:"recommendations"` +} + +// reJSONBlock strips ```json ... ``` markdown fences if present. +var reJSONBlock = regexp.MustCompile("(?s)```(?:json)?\\s*(\\{.*?\\})\\s*```") + +// fetchLLMAnalysis looks for a summary.yaml under the job's S3 prefix and parses it. +// It tries both known path patterns: +// 1. test-results/<component>/<date>/<jobID>/llm-analysis/summary.yaml +// 2. 
test-results/<component>/<date>/<jobID>/install/*/llm-analysis/summary.yaml +func (c *SQSConsumer) fetchLLMAnalysis(bucket, s3Dir string) *models.LLMAnalysis { + // Pattern 1: shallow path + candidates := []string{ + s3Dir + "/llm-analysis/summary.yaml", + } + + // Pattern 2: deep path — list install/* to find the e2e image subdirectory + listOut, err := c.opCollect.s3Client.ListObjectsV2(&s3.ListObjectsV2Input{ + Bucket: aws.String(bucket), + Prefix: aws.String(s3Dir + "/install/"), + }) + if err == nil { + for _, obj := range listOut.Contents { + key := aws.StringValue(obj.Key) + if strings.HasSuffix(key, "/llm-analysis/summary.yaml") { + candidates = append(candidates, key) + } + } + } + + for _, key := range candidates { + out, err := c.opCollect.s3Client.GetObject(&s3.GetObjectInput{ + Bucket: aws.String(bucket), + Key: aws.String(key), + }) + if err != nil { + continue + } + data, err := io.ReadAll(out.Body) + out.Body.Close() + if err != nil { + continue + } + + var sy summaryYAML + if err := yaml.Unmarshal(data, &sy); err != nil || sy.Response == "" { + continue + } + + // Extract the JSON — may be bare or wrapped in ```json ... ``` + raw := sy.Response + if m := reJSONBlock.FindStringSubmatch(raw); len(m) == 2 { + raw = m[1] + } + + var resp llmResponse + if err := json.Unmarshal([]byte(strings.TrimSpace(raw)), &resp); err != nil { + log.Printf("fetchLLMAnalysis: parse response JSON from %s: %v", key, err) + continue + } + + if resp.RootCause == "" { + continue + } + + return &models.LLMAnalysis{ + RootCause: resp.RootCause, + Recommendations: resp.Recommendations, + } + } + + return nil +} + +// Backfill scans all historical S3 objects and populates the store from scratch. +// Run once at first startup or when the DB is missing/corrupt. 
+func (c *SQSConsumer) Backfill() error { + log.Printf("Backfill: scanning s3://%s/test-results/ ...", c.bucket) + + // Collect unique (component, date, jobID) → best junit key + type runKey struct{ component, date, jobID string } + seen := make(map[runKey]string) + + err := c.opCollect.s3Client.ListObjectsV2Pages(&s3.ListObjectsV2Input{ + Bucket: aws.String(c.bucket), + Prefix: aws.String("test-results/"), + }, func(page *s3.ListObjectsV2Output, _ bool) bool { + for _, obj := range page.Contents { + key := aws.StringValue(obj.Key) + if !strings.HasSuffix(key, ".xml") || !strings.Contains(key, "junit") { + continue + } + parts := strings.SplitN(key, "/", 5) + if len(parts) < 5 { + continue + } + rk := runKey{parts[1], parts[2], parts[3]} + if _, exists := seen[rk]; !exists { + seen[rk] = key + } + } + return true + }) + if err != nil { + return fmt.Errorf("list S3: %w", err) + } + + log.Printf("Backfill: %d unique runs found — downloading in parallel...", len(seen)) + + // Fan out with the same worker pool size as the operator collector. + type work struct { + bucket, key string + } + jobs := make(chan work, len(seen)) + for _, key := range seen { + jobs <- work{c.bucket, key} + } + close(jobs) + + type result struct{ err error } + results := make(chan result, len(seen)) + + workers := downloadWorkers + if workers > len(seen) { + workers = len(seen) + } + for i := 0; i < workers; i++ { + go func() { + for j := range jobs { + err := c.processKey(j.bucket, j.key) + results <- result{err} + } + }() + } + + ok, failed := 0, 0 + for range seen { + r := <-results + if r.err != nil { + failed++ + } else { + ok++ + } + } + + log.Printf("Backfill: complete. 
ok=%d failed=%d", ok, failed) + return nil +} \ No newline at end of file diff --git a/pkg/dashboard/collectors/usage.go b/pkg/dashboard/collectors/usage.go new file mode 100644 index 0000000000..e21687dce5 --- /dev/null +++ b/pkg/dashboard/collectors/usage.go @@ -0,0 +1,154 @@ +package collectors + +import ( + "fmt" + "log" + "strings" + "sync" + "time" + + v1 "github.com/openshift-online/ocm-sdk-go/clustersmgmt/v1" + "github.com/openshift/osde2e/pkg/common/clusterproperties" + "github.com/openshift/osde2e/pkg/common/providers/ocmprovider" + "github.com/openshift/osde2e/pkg/dashboard/models" +) + +// UsageCollector collects cluster usage metrics from one or more OCM environments. +type UsageCollector struct { + // providers maps environment name → OCMProvider for that env + providers map[string]*ocmprovider.OCMProvider +} + +// NewUsageCollector creates a UsageCollector that queries the given OCM environments +// in parallel. Each env must be a valid OCM environment name ("stage", "int", "prod", etc.). +// Environments that fail to connect are skipped with a warning. +func NewUsageCollector(envs ...string) (*UsageCollector, error) { + if len(envs) == 0 { + envs = []string{"stage", "int"} + } + + providers := make(map[string]*ocmprovider.OCMProvider, len(envs)) + for _, env := range envs { + p, err := ocmprovider.NewWithEnv(env) + if err != nil { + log.Printf("Warning: could not create provider for environment %s: %v (skipping)", env, err) + continue + } + providers[env] = p + } + + if len(providers) == 0 { + return nil, fmt.Errorf("could not connect to any OCM environment") + } + + return &UsageCollector{providers: providers}, nil +} + +// CollectUsage queries all configured OCM environments in parallel and returns +// one ClusterUsage entry per environment. 
+func (c *UsageCollector) CollectUsage() ([]models.ClusterUsage, error) { + type result struct { + env string + usage *models.ClusterUsage + err error + } + + ch := make(chan result, len(c.providers)) + var wg sync.WaitGroup + + for env, provider := range c.providers { + wg.Add(1) + go func(env string, p *ocmprovider.OCMProvider) { + defer wg.Done() + usage, err := collectUsageForEnv(env, p) + ch <- result{env: env, usage: usage, err: err} + }(env, provider) + } + + wg.Wait() + close(ch) + + var usages []models.ClusterUsage + for r := range ch { + if r.err != nil { + if isAuthError(r.err) { + log.Printf("Info: skipping env %q (OCM account not available in this environment)", r.env) + } else { + log.Printf("Warning: failed to collect usage for env %q: %v", r.env, r.err) + } + continue + } + usages = append(usages, *r.usage) + } + + log.Printf("Collected usage metrics for %d environments", len(usages)) + return usages, nil +} + +// collectUsageForEnv queries a single OCM environment and returns its ClusterUsage. +func collectUsageForEnv(env string, provider *ocmprovider.OCMProvider) (*models.ClusterUsage, error) { + query := "properties.MadeByOSDe2e='true'" + + resp, err := provider.GetConnection().ClustersMgmt().V1().Clusters().List(). + Search(query). + Size(1000). 
+ Send() + if err != nil { + return nil, fmt.Errorf("failed to query clusters: %w", err) + } + + usage := &models.ClusterUsage{ + Environment: env, + ByState: make(map[string]int), + ByAvailability: make(map[string]int), + ByCloudProvider: make(map[string]int), + ByVersion: make(map[string]int), + LastUpdated: time.Now(), + } + + resp.Items().Each(func(cluster *v1.Cluster) bool { + usage.TotalClusters++ + usage.ByState[string(cluster.State())]++ + usage.ByCloudProvider[cluster.CloudProvider().ID()]++ + usage.ByVersion[cluster.Version().ID()]++ + + if props, ok := cluster.GetProperties(); ok { + if avail, exists := props[clusterproperties.Availability]; exists { + usage.ByAvailability[avail]++ + } + } + + return true + }) + + return usage, nil +} + +// CollectUsageByEnvironment retrieves usage for a specific environment. +func (c *UsageCollector) CollectUsageByEnvironment(env string) (*models.ClusterUsage, error) { + p, ok := c.providers[env] + if !ok { + return &models.ClusterUsage{ + Environment: env, + ByState: make(map[string]int), + ByAvailability: make(map[string]int), + ByCloudProvider: make(map[string]int), + ByVersion: make(map[string]int), + LastUpdated: time.Now(), + }, nil + } + return collectUsageForEnv(env, p) +} + +// isAuthError returns true for OCM errors that indicate the token is not valid +// for a given environment (401, 403, 422 user-not-found). These are expected when +// running with a stage/int token against prod, and should not be surfaced as warnings. 
+func isAuthError(err error) bool { + if err == nil { + return false + } + msg := err.Error() + return strings.Contains(msg, "status is 401") || + strings.Contains(msg, "status is 403") || + (strings.Contains(msg, "status is 422") && strings.Contains(msg, "does not exist")) +} diff --git a/pkg/dashboard/config/config.go b/pkg/dashboard/config/config.go new file mode 100644 index 0000000000..9f815846d5 --- /dev/null +++ b/pkg/dashboard/config/config.go @@ -0,0 +1,111 @@ +package config + +import ( + "time" + + viper "github.com/openshift/osde2e/pkg/common/concurrentviper" + commonconfig "github.com/openshift/osde2e/pkg/common/config" +) + +// Dashboard configuration keys +const ( + // Port is the HTTP port the dashboard server listens on + Port = "dashboard.port" + + // Environment filters clusters by environment (stage, prod, integration, all) + Environment = "dashboard.environment" + + // RefreshInterval is how often to refresh data (in seconds) + RefreshInterval = "dashboard.refreshInterval" + + // ExpirationWarningThreshold is the duration before expiration to warn about + ExpirationWarningThreshold = "dashboard.expirationWarningThreshold" + + // MaxTestResults is the maximum number of test results to return + MaxTestResults = "dashboard.maxTestResults" + + // LookbackDays is the number of days of S3 data to scan for operator status + LookbackDays = "dashboard.lookbackDays" + + // SQSQueueURL is the URL of the SQS queue receiving S3 ObjectCreated events + SQSQueueURL = "dashboard.sqsQueueURL" + + // DBPath is the path to the SQLite database file + DBPath = "dashboard.dbPath" +) + +// Default values +const ( + DefaultPort = 8080 + DefaultEnvironment = "all" + DefaultRefreshInterval = 300 // 5 minutes + DefaultExpirationWarningThreshold = 2 * time.Hour + DefaultMaxTestResults = 100 + DefaultLookbackDays = 30 +) + +// Config holds dashboard configuration +type Config struct { + Port int + S3Bucket string // Reuses commonconfig.Tests.LogBucket + S3Region string // 
Reuses commonconfig.AWSRegion + OCMConfigPath string // Reuses commonconfig.OcmConfig + Environment string + RefreshInterval int + ExpirationWarningThreshold time.Duration + MaxTestResults int + LookbackDays int + SQSQueueURL string // SQS queue URL for S3 event notifications + DBPath string // Path to SQLite database file +} + +// LoadConfig loads dashboard configuration from viper +// Reuses existing AWS and OCM configuration from common config +func LoadConfig() *Config { + return &Config{ + Port: viper.GetInt(Port), + S3Bucket: viper.GetString(commonconfig.Tests.LogBucket), + S3Region: viper.GetString(commonconfig.AWSRegion), + OCMConfigPath: viper.GetString(commonconfig.OcmConfig), + Environment: viper.GetString(Environment), + RefreshInterval: viper.GetInt(RefreshInterval), + ExpirationWarningThreshold: viper.GetDuration(ExpirationWarningThreshold), + MaxTestResults: viper.GetInt(MaxTestResults), + LookbackDays: viper.GetInt(LookbackDays), + SQSQueueURL: viper.GetString(SQSQueueURL), + DBPath: viper.GetString(DBPath), + } +} + +// OCMEnvironments returns the list of OCM environments to query. +// "all" expands to stage + int + prod; a specific env returns just that one. 
+func (c *Config) OCMEnvironments() []string { + switch c.Environment { + case "all", "": + return []string{"stage", "int", "prod"} + case "integration": + return []string{"int"} + default: + return []string{c.Environment} + } +} + +// SetDefaults sets default configuration values +func SetDefaults() { + viper.SetDefault(Port, DefaultPort) + viper.SetDefault(Environment, DefaultEnvironment) + viper.SetDefault(RefreshInterval, DefaultRefreshInterval) + viper.SetDefault(ExpirationWarningThreshold, DefaultExpirationWarningThreshold) + viper.SetDefault(MaxTestResults, DefaultMaxTestResults) + viper.SetDefault(LookbackDays, DefaultLookbackDays) + + // Set defaults for S3 bucket if not already set + if viper.GetString(commonconfig.Tests.LogBucket) == "" { + viper.SetDefault(commonconfig.Tests.LogBucket, "osde2e-logs") + } + + // The log bucket lives in us-east-1; fall back to that when no region is configured + if viper.GetString(commonconfig.AWSRegion) == "" { + viper.SetDefault(commonconfig.AWSRegion, "us-east-1") + } +} diff --git a/pkg/dashboard/handlers/utils.go b/pkg/dashboard/handlers/utils.go new file mode 100644 index 0000000000..50ec7f4a38 --- /dev/null +++ b/pkg/dashboard/handlers/utils.go @@ -0,0 +1,8 @@ +package handlers + +import "time" + +// Now returns the current time - useful for testing with mocking +func Now() time.Time { + return time.Now() +} \ No newline at end of file diff --git a/pkg/dashboard/models/types.go b/pkg/dashboard/models/types.go new file mode 100644 index 0000000000..18b7bd0489 --- /dev/null +++ b/pkg/dashboard/models/types.go @@ -0,0 +1,206 @@ +package models + +import "time" + +// ClusterReserve represents a reserved cluster available for testing +type ClusterReserve struct { + ID string `json:"id"` + Name string `json:"name"` + State string `json:"state"` // ready, installing, pending + Availability string `json:"availability"` // reserved, claimed, used + Version string `json:"version"` + Region string `json:"region"` + 
CloudProvider string `json:"cloud_provider"` + CreatedAt time.Time `json:"created_at"` + ExpiresAt time.Time `json:"expires_at"` + Product string `json:"product"` // osd, rosa + Properties map[string]string `json:"properties,omitempty"` +} + +// IsExpiringSoon returns true if the cluster expires within the given duration. +// Returns false for zero or already-expired timestamps. +func (c *ClusterReserve) IsExpiringSoon(threshold time.Duration) bool { + if c.ExpiresAt.IsZero() { + return false + } + remaining := time.Until(c.ExpiresAt) + return remaining >= 0 && remaining < threshold +} + +// ExpiringSoon returns true if the cluster expires within 2 hours (for template use). +// Like IsExpiringSoon, it returns false for zero or already-expired timestamps. +func (c *ClusterReserve) ExpiringSoon() bool { + return c.IsExpiringSoon(2 * time.Hour) +} + +// ClusterUsage represents aggregate cluster usage metrics +type ClusterUsage struct { + Environment string `json:"environment"` // stage, prod, integration + TotalClusters int `json:"total_clusters"` + ByState map[string]int `json:"by_state"` // ready: 5, installing: 2 + ByAvailability map[string]int `json:"by_availability"` // reserved: 3, claimed: 2, used: 1 + ByCloudProvider map[string]int `json:"by_cloud_provider,omitempty"` + ByVersion map[string]int `json:"by_version,omitempty"` + LastUpdated time.Time `json:"last_updated"` +} + +// TestCase holds a single test case result for rendering in the UI +type TestCase struct { + Name string `json:"name"` + Duration float64 `json:"duration_seconds"` + Status string `json:"status"` // passed, failed, error, skipped + Message string `json:"message,omitempty"` // failure/error/skip message +} + +// TestResult represents the outcome of a test execution +type TestResult struct { + JobID string `json:"job_id"` + JobName string `json:"job_name"` + Component string `json:"component"` + Date string `json:"date"` + Status string `json:"status"` // passed, failed, error, skipped + TotalTests int `json:"total_tests"` + PassedTests int 
`json:"passed_tests"` + FailedTests int `json:"failed_tests"` + SkippedTests int `json:"skipped_tests"` + ErrorTests int `json:"error_tests"` + Duration float64 `json:"duration_seconds"` + S3Path string `json:"s3_path"` + LogURL string `json:"log_url,omitempty"` + JUnitXMLURL string `json:"junit_xml_url,omitempty"` + Timestamp time.Time `json:"timestamp"` + TestCases []TestCase `json:"test_cases,omitempty"` +} + + +// DashboardOverview provides a high-level summary for the main dashboard view +type DashboardOverview struct { + TotalReservedClusters int `json:"total_reserved_clusters"` + ClustersExpiringSoon int `json:"clusters_expiring_soon"` + ActiveTests int `json:"active_tests"` + OverallSuccessRate float64 `json:"overall_success_rate"` + RecentTests []TestResult `json:"recent_tests"` + ClusterUsageSummary []ClusterUsage `json:"cluster_usage_summary"` + LastUpdated time.Time `json:"last_updated"` +} + + +// APIResponse is a generic wrapper for API responses +type APIResponse struct { + Success bool `json:"success"` + Data interface{} `json:"data,omitempty"` + Error string `json:"error,omitempty"` + Message string `json:"message,omitempty"` +} + +// FailedTestCase holds the name and failure message of a single failed test +type FailedTestCase struct { + Name string `json:"name"` + Message string `json:"message"` +} + +// LLMAnalysis holds the AI-generated root cause and recommendations from summary.yaml +type LLMAnalysis struct { + RootCause string `json:"root_cause"` + Recommendations []string `json:"recommendations"` +} + +// EnvironmentResult holds the latest test result for one operator+version in one environment +type EnvironmentResult struct { + Status string `json:"status"` // passed, failed, error + Version string `json:"version"` + Total int `json:"total"` + Passed int `json:"passed"` + Failed int `json:"failed"` + Skipped int `json:"skipped"` + Errors int `json:"errors"` + LastRun time.Time `json:"last_run"` + JobID string `json:"job_id"` + LogURL 
string `json:"log_url,omitempty"` + JUnitURL string `json:"junit_url,omitempty"` + FailedTests []FailedTestCase `json:"failed_tests,omitempty"` + LLMAnalysis *LLMAnalysis `json:"llm_analysis,omitempty"` +} + +// DeliverableStatus represents the cross-environment test status for one operator+version +type DeliverableStatus struct { + Name string `json:"name"` + Version string `json:"version"` + Results map[string]*EnvironmentResult `json:"results"` // key: "stage", "prod", "integration", "unknown" + LastUpdated time.Time `json:"last_updated"` +} + +// Stage returns the result for the stage environment, or nil if not available. +func (o DeliverableStatus) Stage() *EnvironmentResult { return o.Results["stage"] } + +// Prod returns the result for the prod environment, or nil if not available. +func (o DeliverableStatus) Prod() *EnvironmentResult { return o.Results["prod"] } + +// Integration returns the result for the integration environment. +// Checks both "int" (stored by SQS consumer) and "integration" (legacy). +func (o DeliverableStatus) Integration() *EnvironmentResult { + if r := o.Results["int"]; r != nil { + return r + } + return o.Results["integration"] +} + +// Unknown returns results from runs where the environment could not be determined. 
+func (o DeliverableStatus) Unknown() *EnvironmentResult { return o.Results["unknown"] } + +// PipelineRun represents one test run of an operator version in one environment +type PipelineRun struct { + Version string `json:"version"` + Env string `json:"env"` // stage, int, prod + Status string `json:"status"` + Date string `json:"date"` + JobID string `json:"job_id"` + LastRun time.Time `json:"last_run"` + LogURL string `json:"log_url,omitempty"` + JUnitURL string `json:"junit_url,omitempty"` + Failed []FailedTestCase `json:"failed_tests,omitempty"` + Total int `json:"total"` + Passed int `json:"passed"` + LLMAnalysis *LLMAnalysis `json:"llm_analysis,omitempty"` +} + +// PipelineHistory holds all historical runs for a single operator, grouped by version +type PipelineHistory struct { + Name string `json:"name"` + Runs []PipelineRun `json:"runs"` // sorted newest first (flat) + Versions []VersionPipeline `json:"versions"` // grouped by version, newest first +} + +// VersionPipeline represents one version of an operator and its run results per env +type VersionPipeline struct { + Version string `json:"version"` + Date string `json:"date"` // date of the most recent run + LastRun time.Time `json:"last_run"` // timestamp of most recent run + EnvRuns map[string]*PipelineRun `json:"env_runs"` // keyed by env: "int", "stage", "prod" +} + +// FailureEntry is one deliverable+version+env that shares a common failure root cause +type FailureEntry struct { + Name string `json:"name"` + Version string `json:"version"` + Env string `json:"env"` + LastRun time.Time `json:"last_run"` + JobID string `json:"job_id"` + LogURL string `json:"log_url,omitempty"` +} + +// FailureGroup groups deliverables that share a similar failure summary +type FailureGroup struct { + FailureMatch string `json:"failure_match"` // first sentence of LLM root cause or failure message — the grouping key + RootCause string `json:"root_cause"` // full LLM root cause (from most recent entry with analysis) + 
Recommendations []string `json:"recommendations"` // LLM recommendations + Entries []FailureEntry `json:"entries"` // sorted newest first +} + +// HealthStatus represents the health check response +type HealthStatus struct { + Status string `json:"status"` // ok, degraded, error + Version string `json:"version,omitempty"` + Timestamp time.Time `json:"timestamp"` + OCMConnected bool `json:"ocm_connected"` + S3Connected bool `json:"s3_connected"` +} \ No newline at end of file diff --git a/pkg/dashboard/server/server.go b/pkg/dashboard/server/server.go new file mode 100644 index 0000000000..dfbf27b8e7 --- /dev/null +++ b/pkg/dashboard/server/server.go @@ -0,0 +1,555 @@ +package server + +import ( + "context" + "encoding/json" + "fmt" + "io" + "log" + "net/http" + "strings" + "time" + + "github.com/aws/aws-sdk-go/aws" + awss3 "github.com/aws/aws-sdk-go/service/s3" + junit "github.com/joshdk/go-junit" + "github.com/openshift/osde2e/pkg/dashboard/collectors" + "github.com/openshift/osde2e/pkg/dashboard/config" + "github.com/openshift/osde2e/pkg/dashboard/handlers" + "github.com/openshift/osde2e/pkg/dashboard/models" + "github.com/openshift/osde2e/pkg/dashboard/store" +) + +// Server represents the dashboard HTTP server +type Server struct { + config *config.Config + reserveCollector *collectors.ReserveCollector + usageCollector *collectors.UsageCollector + testResultCollector *collectors.TestResultsCollector + deliverableCollector *collectors.DeliverableCollector + store *store.Store // optional; when set, deliverables/history served from DB + mux *http.ServeMux +} + +// NewServer creates a new dashboard server instance +func NewServer(cfg *config.Config) (*Server, error) { + // Initialize collectors + reserveCollector, err := collectors.NewReserveCollector(cfg.OCMEnvironments()...) 
+ if err != nil { + log.Printf("Warning: Failed to initialize reserve collector: %v", err) + reserveCollector = nil + } + + usageCollector, err := collectors.NewUsageCollector(cfg.OCMEnvironments()...) + if err != nil { + log.Printf("Warning: Failed to initialize usage collector: %v", err) + usageCollector = nil + } + + var testResultCollector *collectors.TestResultsCollector + var deliverableCollector *collectors.DeliverableCollector + if cfg.S3Bucket != "" { + testResultCollector, err = collectors.NewTestResultsCollector(cfg.S3Bucket, cfg.S3Region) + if err != nil { + log.Printf("Warning: Failed to initialize test results collector: %v", err) + testResultCollector = nil + } + + deliverableCollector, err = collectors.NewDeliverableCollector(cfg.S3Bucket, cfg.S3Region, cfg.LookbackDays) + if err != nil { + log.Printf("Warning: Failed to initialize deliverable status collector: %v", err) + deliverableCollector = nil + } + } + + srv := &Server{ + config: cfg, + reserveCollector: reserveCollector, + usageCollector: usageCollector, + testResultCollector: testResultCollector, + deliverableCollector: deliverableCollector, + mux: http.NewServeMux(), + } + + // Setup routes + srv.setupRoutes() + + return srv, nil +} + +// setupRoutes configures all HTTP routes +func (s *Server) setupRoutes() { + // HTML pages + s.mux.HandleFunc("/", s.handleRedirect) + s.mux.HandleFunc("/dashboard/usage", s.handleUsagePage) + s.mux.HandleFunc("/dashboard/pipelines", s.handleDeliverablesPage) + s.mux.HandleFunc("/dashboard/pipelines/", s.handlePipelineDetailPage) + s.mux.HandleFunc("/dashboard/analysis", s.handleAnalysisPage) + + // API endpoints + s.mux.HandleFunc("/api/v1/reserves", s.handleReservesAPI) + s.mux.HandleFunc("/api/v1/usage", s.handleUsageAPI) + s.mux.HandleFunc("/api/v1/overview", s.handleOverviewAPI) + s.mux.HandleFunc("/api/v1/deliverables", s.handleDeliverablesAPI) + + // S3 object proxy (streams objects server-side, no presigned URL expiry) + 
s.mux.HandleFunc("/dashboard/s3", s.handleS3Proxy) + + // JUnit XML viewer + s.mux.HandleFunc("/dashboard/junit", s.handleJUnitReport) + + // Health check + s.mux.HandleFunc("/health", s.handleHealth) +} + +// WithStore attaches a SQLite store to the server. +// When set, the deliverables overview and pipeline-detail pages read from the DB +// instead of making live S3 API calls. +func (s *Server) WithStore(st *store.Store) { + s.store = st +} + +// Start starts the HTTP server and blocks until ctx is cancelled, then shuts down gracefully. +func (s *Server) Start(addr string, ctx context.Context) error { + srv := &http.Server{ + Addr: addr, + Handler: s.mux, + ReadHeaderTimeout: 10 * time.Second, + ReadTimeout: 30 * time.Second, + WriteTimeout: 30 * time.Second, + IdleTimeout: 60 * time.Second, + } + + go func() { + <-ctx.Done() + log.Printf("Shutting down dashboard server...") + shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _ = srv.Shutdown(shutdownCtx) + }() + + log.Printf("Starting server on %s", addr) + if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed { + return err + } + return nil +} + +// handleRedirect redirects / and /dashboard to /dashboard/pipelines. +// It is registered on "/", so any other unmatched path gets a 404. +func (s *Server) handleRedirect(w http.ResponseWriter, r *http.Request) { + if r.URL.Path == "/" || r.URL.Path == "/dashboard" { + http.Redirect(w, r, "/dashboard/pipelines", http.StatusMovedPermanently) + return + } + http.NotFound(w, r) +} + + +// handleUsagePage serves the Clusters page — all osde2e clusters grouped by env. +func (s *Server) handleUsagePage(w http.ResponseWriter, r *http.Request) { + // envOrder defines the display sequence of environments. 
+ envOrder := []string{"int", "stage", "prod"} + + type EnvClusters struct { + Env string + Clusters []models.ClusterReserve + } + + var envClusters []EnvClusters + + if s.reserveCollector != nil { + byEnv, err := s.reserveCollector.CollectClustersPerEnv() + if err != nil { + log.Printf("Warning: Failed to collect clusters per env: %v", err) + } else { + for _, env := range envOrder { + if clusters, ok := byEnv[env]; ok { + envClusters = append(envClusters, EnvClusters{Env: env, Clusters: clusters}) + } + } + } + } + + data := map[string]interface{}{ + "ActivePage": "usage", + "EnvClusters": envClusters, + } + + s.renderTemplate(w, "usage.html", data) +} + +// API Handlers + +// handleReservesAPI returns cluster reserves as JSON +func (s *Server) handleReservesAPI(w http.ResponseWriter, r *http.Request) { + if s.reserveCollector == nil { + s.sendAPIError(w, "Reserve collector not initialized", http.StatusServiceUnavailable) + return + } + + reserves, err := s.reserveCollector.CollectReserves() + if err != nil { + s.sendAPIError(w, fmt.Sprintf("Failed to collect reserves: %v", err), http.StatusInternalServerError) + return + } + + s.sendAPISuccess(w, reserves) +} + +// handleUsageAPI returns cluster usage metrics as JSON +func (s *Server) handleUsageAPI(w http.ResponseWriter, r *http.Request) { + if s.usageCollector == nil { + s.sendAPIError(w, "Usage collector not initialized", http.StatusServiceUnavailable) + return + } + + env := r.URL.Query().Get("environment") + if env != "" { + usage, err := s.usageCollector.CollectUsageByEnvironment(env) + if err != nil { + s.sendAPIError(w, fmt.Sprintf("Failed to collect usage: %v", err), http.StatusInternalServerError) + return + } + s.sendAPISuccess(w, usage) + return + } + + usage, err := s.usageCollector.CollectUsage() + if err != nil { + s.sendAPIError(w, fmt.Sprintf("Failed to collect usage: %v", err), http.StatusInternalServerError) + return + } + + s.sendAPISuccess(w, usage) +} + +// handleDeliverablesPage serves the 
deliverables pipeline status HTML page. +// When a Store is configured it reads from SQLite (<1ms); otherwise falls back +// to a live S3 scan (slow, legacy path). +func (s *Server) handleDeliverablesPage(w http.ResponseWriter, r *http.Request) { + var deliverables []models.DeliverableStatus + + if s.store != nil { + // Fast path: DB read + result, err := s.store.GetLatest() + if err != nil { + log.Printf("Warning: store.GetLatest: %v", err) + deliverables = []models.DeliverableStatus{} + } else { + deliverables = result + } + } else if s.deliverableCollector != nil { + // Slow path: live S3 scan + collected, err := s.deliverableCollector.CollectDeliverables() + if err != nil { + log.Printf("Warning: Failed to collect deliverable status: %v", err) + deliverables = []models.DeliverableStatus{} + } else { + deliverables = collected + } + } else { + deliverables = []models.DeliverableStatus{} + } + + data := map[string]interface{}{ + "ActivePage": "deliverables", + "Deliverables": deliverables, + "Environments": []string{"stage", "integration"}, + "S3Bucket": s.config.S3Bucket, + } + + s.renderTemplate(w, "deliverables.html", data) +} + +// handlePipelineDetailPage serves the per-deliverable pipeline history page. +// URL: /dashboard/pipelines/ +// When a Store is configured it reads from SQLite (<1ms); otherwise falls back +// to a live S3 scan (slow, legacy path). 
+func (s *Server) handlePipelineDetailPage(w http.ResponseWriter, r *http.Request) { + name := strings.TrimPrefix(r.URL.Path, "/dashboard/pipelines/") + name = strings.TrimSpace(name) + if name == "" { + http.Redirect(w, r, "/dashboard/pipelines", http.StatusSeeOther) + return + } + + var history *models.PipelineHistory + var err error + + if s.store != nil { + // Fast path: DB read + history, err = s.store.GetHistory(name) + if err != nil { + log.Printf("store.GetHistory %s: %v", name, err) + s.sendError(w, "Failed to load pipeline history", http.StatusInternalServerError) + return + } + } else if s.deliverableCollector != nil { + // Slow path: live S3 scan + history, err = s.deliverableCollector.CollectPipelineHistory(name) + if err != nil { + log.Printf("Failed to collect pipeline history for %s: %v", name, err) + s.sendError(w, "Failed to load pipeline history", http.StatusInternalServerError) + return + } + } else { + history = &models.PipelineHistory{Name: name} + } + + data := map[string]interface{}{ + "ActivePage": "deliverables", + "History": history, + } + + s.renderTemplate(w, "pipeline-detail.html", data) +} + +// handleAnalysisPage groups all failed runs by AI root cause and renders the analysis page. 
+func (s *Server) handleAnalysisPage(w http.ResponseWriter, r *http.Request) { + var groups []models.FailureGroup + + if s.store != nil { + var err error + groups, err = s.store.GetFailureGroups() + if err != nil { + log.Printf("Warning: GetFailureGroups: %v", err) + groups = []models.FailureGroup{} + } + } + + data := map[string]interface{}{ + "ActivePage": "analysis", + "Groups": groups, + } + + s.renderTemplate(w, "analysis.html", data) +} + +// handleDeliverablesAPI returns deliverable status as JSON +func (s *Server) handleDeliverablesAPI(w http.ResponseWriter, r *http.Request) { + if s.deliverableCollector == nil { + s.sendAPIError(w, "Deliverable collector not initialized (S3 bucket not configured)", http.StatusServiceUnavailable) + return + } + + deliverables, err := s.deliverableCollector.CollectDeliverables() + if err != nil { + s.sendAPIError(w, fmt.Sprintf("Failed to collect deliverable status: %v", err), http.StatusInternalServerError) + return + } + + // Optional ?name= filter + if nameFilter := r.URL.Query().Get("name"); nameFilter != "" { + filtered := deliverables[:0] + for _, op := range deliverables { + if op.Name == nameFilter { + filtered = append(filtered, op) + } + } + deliverables = filtered + } + + s.sendAPISuccess(w, deliverables) +} + +// handleOverviewAPI returns dashboard overview data +func (s *Server) handleOverviewAPI(w http.ResponseWriter, r *http.Request) { + overview := s.collectOverview() + + s.sendAPISuccess(w, overview) +} + +// handleS3Proxy streams an S3 object through the server using its AWS credentials. +// URL: /dashboard/s3?key= +// This avoids presigned URL expiry — the server holds long-lived credentials. 
+func (s *Server) handleS3Proxy(w http.ResponseWriter, r *http.Request) { + key := r.URL.Query().Get("key") + if key == "" { + http.Error(w, "missing key parameter", http.StatusBadRequest) + return + } + if s.deliverableCollector == nil { + http.Error(w, "S3 not configured", http.StatusServiceUnavailable) + return + } + + s3Client, bucket := s.deliverableCollector.S3Client() + out, err := s3Client.GetObjectWithContext(r.Context(), &awss3.GetObjectInput{ + Bucket: aws.String(bucket), + Key: aws.String(key), + }) + if err != nil { + log.Printf("handleS3Proxy: GetObject %s: %v", key, err) + http.Error(w, "Failed to fetch object from S3", http.StatusBadGateway) + return + } + defer out.Body.Close() + + if ct := aws.StringValue(out.ContentType); ct != "" { + w.Header().Set("Content-Type", ct) + } else if strings.HasSuffix(key, ".log") { + w.Header().Set("Content-Type", "text/plain; charset=utf-8") + } else if strings.HasSuffix(key, ".xml") { + w.Header().Set("Content-Type", "application/xml") + } + _, _ = io.Copy(w, out.Body) +} + +// handleJUnitReport fetches a JUnit XML from S3 and renders it as HTML. 
+// URL: /dashboard/junit?key= +func (s *Server) handleJUnitReport(w http.ResponseWriter, r *http.Request) { + key := r.URL.Query().Get("key") + if key == "" { + http.Error(w, "missing key parameter", http.StatusBadRequest) + return + } + if s.deliverableCollector == nil { + http.Error(w, "S3 not configured", http.StatusServiceUnavailable) + return + } + + s3Client, bucket := s.deliverableCollector.S3Client() + out, err := s3Client.GetObjectWithContext(r.Context(), &awss3.GetObjectInput{ + Bucket: aws.String(bucket), + Key: aws.String(key), + }) + if err != nil { + log.Printf("handleJUnitReport: GetObject %s: %v", key, err) + s.sendError(w, "Failed to fetch JUnit XML from S3", http.StatusBadGateway) + return + } + defer out.Body.Close() + + suites, err := junit.IngestReader(out.Body) + if err != nil { + log.Printf("handleJUnitReport: parse error: %v", err) + s.sendError(w, "Failed to parse JUnit XML", http.StatusUnprocessableEntity) + return + } + + s.renderTemplate(w, "junit-report.html", map[string]interface{}{ + "ActivePage": "deliverables", + "Suites": suites, + }) +} + +// handleHealth returns server health status +func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) { + status := models.HealthStatus{ + Status: "ok", + Timestamp: handlers.Now(), + OCMConnected: s.reserveCollector != nil, + S3Connected: s.testResultCollector != nil, + } + + if !status.OCMConnected || !status.S3Connected { + status.Status = "degraded" + } + + s.sendJSON(w, status) +} + +// Helper methods + +// collectOverview aggregates data from all collectors +func (s *Server) collectOverview() *models.DashboardOverview { + overview := &models.DashboardOverview{ + LastUpdated: handlers.Now(), + RecentTests: []models.TestResult{}, + ClusterUsageSummary: []models.ClusterUsage{}, + } + + // Collect reserves + if s.reserveCollector != nil { + reserves, err := s.reserveCollector.CollectReserves() + if err != nil { + log.Printf("Warning: Failed to collect reserves: %v", err) + } 
else { + overview.TotalReservedClusters = len(reserves) + overview.ClustersExpiringSoon = s.reserveCollector.CountExpiringSoon(reserves, s.config.ExpirationWarningThreshold) + } + } + + // Collect usage + if s.usageCollector != nil { + usage, err := s.usageCollector.CollectUsage() + if err != nil { + log.Printf("Warning: Failed to collect usage: %v", err) + } else { + overview.ClusterUsageSummary = usage + } + } + + // Collect recent tests + if s.testResultCollector != nil { + tests, err := s.testResultCollector.CollectRecentTests(20) // Last 20 tests + if err != nil { + log.Printf("Warning: Failed to collect test results: %v", err) + } else { + overview.RecentTests = tests + overview.ActiveTests = countActiveTests(tests) + overview.OverallSuccessRate = calculateSuccessRate(tests) + } + } + + return overview +} + +// sendJSON sends a JSON response +func (s *Server) sendJSON(w http.ResponseWriter, data interface{}) { + w.Header().Set("Content-Type", "application/json") + if err := json.NewEncoder(w).Encode(data); err != nil { + log.Printf("Error encoding JSON: %v", err) + } +} + +// sendAPISuccess sends a successful API response +func (s *Server) sendAPISuccess(w http.ResponseWriter, data interface{}) { + s.sendJSON(w, models.APIResponse{ + Success: true, + Data: data, + }) +} + +// sendAPIError sends an API error response with the given HTTP status code +func (s *Server) sendAPIError(w http.ResponseWriter, message string, statusCode int) { + // Set Content-Type before WriteHeader: headers changed after WriteHeader are not sent. 
+ w.Header().Set("Content-Type", "application/json") + w.WriteHeader(statusCode) + s.sendJSON(w, models.APIResponse{ + Success: false, + Error: message, + }) +} + +// sendError sends an error response +func (s *Server) sendError(w http.ResponseWriter, message string, statusCode int) { + http.Error(w, message, statusCode) +} + +// Helper functions + +func countActiveTests(tests []models.TestResult) int { + // For now, consider tests from the last hour as "active" + // This can be refined based on actual test execution patterns + count := 0 + for _, test := range tests { + if 
handlers.Now().Sub(test.Timestamp).Hours() < 1 { + count++ + } + } + return count +} + +func calculateSuccessRate(tests []models.TestResult) float64 { + if len(tests) == 0 { + return 0 + } + + passed := 0 + for _, test := range tests { + if test.Status == "passed" { + passed++ + } + } + + return float64(passed) / float64(len(tests)) * 100 +} diff --git a/pkg/dashboard/server/templates.go b/pkg/dashboard/server/templates.go new file mode 100644 index 0000000000..46d6f85b96 --- /dev/null +++ b/pkg/dashboard/server/templates.go @@ -0,0 +1,74 @@ +package server + +import ( + "bytes" + "embed" + "html/template" + "log" + "net/http" + "strings" + "time" +) + +//go:embed templates/*.html +var templateFS embed.FS + +var funcMap = template.FuncMap{ + "now": time.Now, + // localTime converts a time.Time to local timezone, formatted as "2006-01-02 15:04 MST" + "localTime": func(t time.Time) string { + return t.Local().Format("2006-01-02 15:04 MST") + }, + // localDate converts a time.Time to local timezone, formatted as "2006-01-02" (date only) + "localDate": func(t time.Time) string { + return t.Local().Format("2006-01-02") + }, + // subtract returns a - b (used in templates for failed count = total - passed) + "subtract": func(a, b int) int { + return a - b + }, + // add returns a + b (used in junit-report.html for aggregating totals) + "add": func(a, b int) int { + return a + b + }, + // githubRepo normalises an operator name to its GitHub repo name by stripping + // job-variant suffixes like -e2e-master that are not part of the repo name. + "githubRepo": func(name string) string { + for _, suffix := range []string{"-e2e-master", "-e2e"} { + if idx := strings.Index(name, suffix); idx > 0 { + return name[:idx] + } + } + return name + }, +} + +// renderTemplate renders an HTML template with data. 
+// Each call parses base.html + the requested page file as a fresh template set +// so that {{define "content"}} blocks from different pages don't collide. 
+func (s *Server) renderTemplate(w http.ResponseWriter, name string, data interface{}) { + tmpl, err := template.New("").Funcs(funcMap).ParseFS(templateFS, + "templates/base.html", + "templates/"+name, + ) + if err != nil { + log.Printf("Error loading template %s: %v", name, err) + http.Error(w, "Template error", http.StatusInternalServerError) + return + } + + var buf bytes.Buffer + if err := tmpl.ExecuteTemplate(&buf, "base.html", data); err != nil { + log.Printf("Error rendering template %s: %v", name, err) + http.Error(w, "Template rendering error", http.StatusInternalServerError) + return + } + w.Header().Set("Content-Type", "text/html; charset=utf-8") + _, _ = buf.WriteTo(w) +} + +// PageData represents common data passed to all pages +type PageData struct { + ActivePage string + Data interface{} +} diff --git a/pkg/dashboard/server/templates/analysis.html b/pkg/dashboard/server/templates/analysis.html new file mode 100644 index 0000000000..76daf89546 --- /dev/null +++ b/pkg/dashboard/server/templates/analysis.html @@ -0,0 +1,196 @@ +{{template "base.html" .}} + +{{define "title"}}Delivery Dashboard - Analysis{{end}} + +{{define "extra-css"}} + +{{end}} + +{{define "content"}} +

Analysis

+

+ Failed deliverables grouped by AI-identified root cause — most widespread failures first. + Click a deliverable name to see its full pipeline history. +

+ +{{if gt (len .Groups) 0}} + {{range $gi, $group := .Groups}} +
+
+
{{len $group.Entries}} matching
+
{{$group.FailureMatch}}
+ {{if $group.RootCause}} +
+ AI Analysis: {{$group.RootCause}} + {{if gt (len $group.Recommendations) 0}} +
    {{range $group.Recommendations}}
  1. {{.}}
  2. {{end}}
+ {{end}} +
+ {{end}} +
+ + + + + + + + + + + + + {{range $group.Entries}} + + + + + + + + {{end}} + +
DeliverableVersionEnvWhenLogs
+ + {{.Name}} + + {{.Version}} + {{if eq .Env "int"}} + int + {{else if eq .Env "stage"}} + stage + {{else if eq .Env "prod"}} + prod + {{else}} + {{.Env}} + {{end}} + {{localTime .LastRun}} + {{if .LogURL}} + Logs + {{else}} + + {{end}} +
+
+ {{end}} +{{else}} +
+

No failure patterns found

+

AI analysis will appear here once failed runs with summary.yaml are backfilled.

+
+{{end}} +{{end}} diff --git a/pkg/dashboard/server/templates/base.html b/pkg/dashboard/server/templates/base.html new file mode 100644 index 0000000000..17f9c8dd63 --- /dev/null +++ b/pkg/dashboard/server/templates/base.html @@ -0,0 +1,294 @@ + + + + + + {{block "title" .}}Delivery Dashboard{{end}} + + {{block "extra-css" .}}{{end}} + + +
+
+

Delivery Dashboard

+ +
+
+ +
+
+ {{block "content" .}}{{end}} +
+
+ +
+
+

© 2026 Delivery Dashboard | JIRA: SDCICD-1823

+
+
+ + {{block "extra-js" .}}{{end}} + + + diff --git a/pkg/dashboard/server/templates/deliverables.html b/pkg/dashboard/server/templates/deliverables.html new file mode 100644 index 0000000000..63c24ca50a --- /dev/null +++ b/pkg/dashboard/server/templates/deliverables.html @@ -0,0 +1,417 @@ +{{template "base.html" .}} + +{{define "title"}}Delivery Dashboard - Pipelines{{end}} + + +{{define "extra-css"}} + +{{end}} + +{{define "content"}} +

Pipelines

+

+ Latest test result per deliverable per environment — sourced from persisted osde2e-logs S3 bucket. + Click a status badge to see failure details. Click a deliverable name to see full history. +

+ +
+ + +
+ +
+ {{if gt (len .Deliverables) 0}} + + + + + + + + + + {{range $i, $op := .Deliverables}} + + + + {{/* Integration */}} + + + {{/* Stage */}} + + + + {{end}}{{/* end range .Deliverables */}} + +
DeliverableIntegrationStage
{{$op.Name}} + {{with $op.Integration}} + + {{else}} + + {{end}} + + {{with $op.Stage}} + + {{else}} + + {{end}} +
+ + {{/* Dialogs must live outside to produce valid HTML */}} + {{range $i, $op := .Deliverables}} + {{with $op.Stage}} + +
+ {{$op.Name}} — Stage + +
+
+
+
Status
+
{{if eq .Status "passed"}}Passed{{else}}{{.Status}}{{end}}
+
Tag
{{.Version}}
+
Tests
{{.Passed}} passed / {{.Failed}} failed / {{.Total}} total
+
Last run
{{.LastRun.Format "2006-01-02 15:04 UTC"}}
+
Job suffix
{{.JobID}}
+
+ {{with .LLMAnalysis}} +
+
AI Analysis
+
{{.RootCause}}
+ {{if gt (len .Recommendations) 0}} +
Recommendations: +
    {{range .Recommendations}}
  1. {{.}}
  2. {{end}}
+
+ {{end}} +
+ {{end}} +
+ +
+ {{end}} + {{with $op.Integration}} + +
+ {{$op.Name}} — Integration + +
+
+
+
Status
+
{{if eq .Status "passed"}}Passed{{else}}{{.Status}}{{end}}
+
Tag
{{.Version}}
+
Tests
{{.Passed}} passed / {{.Failed}} failed / {{.Total}} total
+
Last run
{{.LastRun.Format "2006-01-02 15:04 UTC"}}
+
Job suffix
{{.JobID}}
+
+ {{with .LLMAnalysis}} +
+
AI Analysis
+
{{.RootCause}}
+ {{if gt (len .Recommendations) 0}} +
Recommendations: +
    {{range .Recommendations}}
  1. {{.}}
  2. {{end}}
+
+ {{end}} +
+ {{end}} +
+ +
+ {{end}} + {{end}}{{/* end dialog range */}} + {{else}} +
+

No deliverable results found

+

Results will appear here once tests have run and S3 is configured.

+
+ {{end}} +
+{{end}} + +{{define "extra-js"}} + +{{end}} \ No newline at end of file diff --git a/pkg/dashboard/server/templates/junit-report.html b/pkg/dashboard/server/templates/junit-report.html new file mode 100644 index 0000000000..12a78eaa13 --- /dev/null +++ b/pkg/dashboard/server/templates/junit-report.html @@ -0,0 +1,229 @@ +{{define "title"}}JUnit Report — Delivery Dashboard{{end}} + +{{define "extra-css"}} + +{{end}} + +{{define "content"}} +{{/* Aggregate totals across all suites */}} +{{$total := 0}}{{$passed := 0}}{{$failed := 0}}{{$skipped := 0}}{{$errors := 0}} +{{range .Suites}} + {{$total = (add $total .Totals.Tests)}} + {{$passed = (add $passed .Totals.Passed)}} + {{$failed = (add $failed .Totals.Failed)}} + {{$skipped = (add $skipped .Totals.Skipped)}} + {{$errors = (add $errors .Totals.Error)}} +{{end}} + +
+
{{$total}}
Total
+
{{$passed}}
Passed
+
{{add $failed $errors}}
Failed
+ {{if gt $skipped 0}} +
{{$skipped}}
Skipped
+ {{end}} +
+ +{{/* Failures first */}} +{{$anyFailures := false}} +{{range .Suites}}{{range .Tests}}{{if or (eq .Status "failed") (eq .Status "error")}}{{$anyFailures = true}}{{end}}{{end}}{{end}} + +{{if $anyFailures}} +
Failures
+{{range .Suites}} + {{$suiteName := .Name}} + {{range .Tests}} + {{if or (eq .Status "failed") (eq .Status "error")}} +
+ + {{if .Classname}}{{.Classname}} / {{end}}{{.Name}} + {{.Duration}} + {{.Status}} + +
+ {{if .Message}}
Message
{{.Message}}
{{end}} + {{if .Error}}
Detail
{{.Error}}
{{end}} + {{if .SystemOut}}
Stdout
{{.SystemOut}}
{{end}} + {{if .SystemErr}}
Stderr
{{.SystemErr}}
{{end}} +
+
+ {{end}} + {{end}} +{{end}} +{{end}} + +{{/* Passing tests */}} +{{$anyPassed := false}} +{{range .Suites}}{{range .Tests}}{{if eq .Status "passed"}}{{$anyPassed = true}}{{end}}{{end}}{{end}} + +{{if $anyPassed}} +
Passed
+{{range .Suites}} + {{$suiteName := .Name}} + {{$suiteHasPassed := false}} + {{range .Tests}}{{if eq .Status "passed"}}{{$suiteHasPassed = true}}{{end}}{{end}} + {{if $suiteHasPassed}} +
{{$suiteName}}
+ {{range .Tests}} + {{if eq .Status "passed"}} +
+ + {{if .Classname}}{{.Classname}} / {{end}}{{.Name}} + {{.Duration}} + passed + +
+ {{if .SystemOut}}
Stdout
{{.SystemOut}}
+ {{else if .SystemErr}}
Stderr
{{.SystemErr}}
+ {{else}}No output captured.{{end}} +
+
+ {{end}} + {{end}} + {{end}} +{{end}} +{{end}} + +{{/* Skipped tests */}} +{{$anySkipped := false}} +{{range .Suites}}{{range .Tests}}{{if eq .Status "skipped"}}{{$anySkipped = true}}{{end}}{{end}}{{end}} + +{{if $anySkipped}} +
Skipped
+{{range .Suites}} + {{range .Tests}} + {{if eq .Status "skipped"}} +
+ + {{if .Classname}}{{.Classname}} / {{end}}{{.Name}} + {{.Duration}} + skipped + +
+ {{if .Message}}
{{.Message}}
{{else}}No skip reason provided.{{end}} +
+
+ {{end}} + {{end}} +{{end}} +{{end}} + +{{if eq $total 0}} +

No test cases found in this JUnit XML report.

+{{end}} +{{end}} diff --git a/pkg/dashboard/server/templates/pipeline-detail.html b/pkg/dashboard/server/templates/pipeline-detail.html new file mode 100644 index 0000000000..42f1496513 --- /dev/null +++ b/pkg/dashboard/server/templates/pipeline-detail.html @@ -0,0 +1,319 @@ +{{template "base.html" .}} + +{{define "title"}}Delivery Dashboard - {{.History.Name}}{{end}} + +{{define "extra-css"}} + +{{end}} + +{{define "content"}} +← Back to Pipelines + +

{{.History.Name}}

+

+ Each row is a version of this deliverable. Runs flow int → stage. Click a failed node to see details. +

+ +
+{{if gt (len .History.Versions) 0}} + + {{/* Column headers */}} +
+
Version
+
+
+
Int
+
+
Stage
+
+
+ +
+ {{range $vi, $vp := .History.Versions}} + + {{$intRun := index $vp.EnvRuns "int"}} + {{$stageRun := index $vp.EnvRuns "stage"}} + +
+ {{/* Version label */}} +
+ {{$vp.Version}} + {{$vp.LastRun.Format "2006-01-02"}} +
+ + {{/* Pipeline flow: connector → int node → arrow → stage node */}} +
+
+ + {{/* Int node */}} +
+ {{if $intRun}} + {{if eq $intRun.Status "passed"}} + + {{else}} + + {{end}} + {{else}} + + {{end}} +
+ + {{/* Arrow */}} +
+ + {{/* Stage node */}} +
+ {{if $stageRun}} + {{if eq $stageRun.Status "passed"}} + + {{else}} + + {{end}} + {{else}} + + {{end}} +
+
+
+ + {{/* Dialogs for failed runs */}} + {{if $intRun}}{{if ne $intRun.Status "passed"}} + +
+ {{$vp.Version}} — Int — {{$intRun.Date}} + +
+
+
+
Status
{{$intRun.Status}}
+
Tests
{{$intRun.Passed}} passed / {{subtract $intRun.Total $intRun.Passed}} failed / {{$intRun.Total}} total
+
Run at
{{$intRun.LastRun.Format "2006-01-02 15:04 UTC"}}
+
Job suffix
{{$intRun.JobID}}
+
+ {{with $intRun.LLMAnalysis}} +
+
AI Analysis
+
{{.RootCause}}
+ {{if gt (len .Recommendations) 0}} +
Recommendations: +
    {{range .Recommendations}}
  1. {{.}}
  2. {{end}}
+
+ {{end}} +
+ {{end}} +
+ +
+ {{end}}{{end}} + + {{if $stageRun}}{{if ne $stageRun.Status "passed"}} + +
+ {{$vp.Version}} — Stage — {{$stageRun.Date}} + +
+
+
+
Status
{{$stageRun.Status}}
+
Tests
{{$stageRun.Passed}} passed / {{subtract $stageRun.Total $stageRun.Passed}} failed / {{$stageRun.Total}} total
+
Run at
{{$stageRun.LastRun.Format "2006-01-02 15:04 UTC"}}
+
Job suffix
{{$stageRun.JobID}}
+
+ {{with $stageRun.LLMAnalysis}} +
+
AI Analysis
+
{{.RootCause}}
+ {{if gt (len .Recommendations) 0}} +
Recommendations: +
    {{range .Recommendations}}
  1. {{.}}
  2. {{end}}
+
+ {{end}} +
+ {{end}} +
+ +
+ {{end}}{{end}} + + {{end}}{{/* end range Versions */}} +
+ +{{else}} +
+

No historical runs found for {{.History.Name}}

+

Runs will appear here once test results are uploaded to S3.

+
+{{end}} +
+{{end}} + +{{define "extra-js"}} + +{{end}} diff --git a/pkg/dashboard/server/templates/usage.html b/pkg/dashboard/server/templates/usage.html new file mode 100644 index 0000000000..183d08acd1 --- /dev/null +++ b/pkg/dashboard/server/templates/usage.html @@ -0,0 +1,218 @@ +{{template "base.html" .}} + +{{define "title"}}Delivery Dashboard - Infra{{end}} + +{{define "extra-css"}} + +{{end}} + +{{define "content"}} +

Infra

+

All osde2e clusters across environments. Click an environment heading to collapse.

+ +{{if gt (len .EnvClusters) 0}} + {{range .EnvClusters}} +
+

+ + {{.Env}} + {{len .Clusters}} clusters +

+
+ {{if gt (len .Clusters) 0}} + + + + + + + + + + + + + + + {{range .Clusters}} + + + + + + + + + + + {{end}} + +
Cluster IDStateAvailabilityVersionFlavorAd Hoc ImageCreatedExpires
{{.ID}} + {{if eq .State "ready"}} + ● ready + {{else if eq .State "installing"}} + ◌ installing + {{else if eq .State "error"}} + ✗ error + {{else}} + {{.State}} + {{end}} + + {{$avail := index .Properties "Availability"}} + {{if eq $avail "reserved"}} + reserved + {{else if eq $avail "claimed"}} + claimed + {{else if eq $avail "used"}} + used + {{else if $avail}} + {{$avail}} + {{else}} + + {{end}} + {{.Version}}{{.Product}} + {{$img := index .Properties "AdHocTestImages"}} + {{if $img}} + {{$img}} + {{else}} + + {{end}} + {{localTime .CreatedAt}} + {{if .ExpiresAt.IsZero}}—{{else}}{{localTime .ExpiresAt}}{{end}} +
+ {{else}} +

No clusters found for this environment.

+ {{end}} +
+
+ {{end}} +{{else}} +
+
+

No cluster data available

+

OCM credentials may not be configured, or no osde2e clusters exist.

+
+
+{{end}} +{{end}} + +{{define "extra-js"}} + +{{end}} diff --git a/pkg/dashboard/store/store.go b/pkg/dashboard/store/store.go new file mode 100644 index 0000000000..830628c174 --- /dev/null +++ b/pkg/dashboard/store/store.go @@ -0,0 +1,513 @@ +// Package store provides a SQLite-backed persistence layer for pipeline results. +// It is written to by the SQS consumer (incremental) and the backfill job (bulk), +// and read by the dashboard HTTP handlers for sub-millisecond page loads. +package store + +import ( + "database/sql" + "encoding/json" + "fmt" + "log" + "strings" + "time" + + _ "modernc.org/sqlite" // pure-Go SQLite driver, no CGO required + + "github.com/openshift/osde2e/pkg/dashboard/models" +) + +const schema = ` +PRAGMA journal_mode=WAL; +PRAGMA foreign_keys=ON; + +-- Latest result per (operator, env) — used by the Pipelines overview table. +CREATE TABLE IF NOT EXISTS pipeline_latest ( + name TEXT NOT NULL, + env TEXT NOT NULL, + version TEXT NOT NULL DEFAULT 'unknown', + status TEXT NOT NULL DEFAULT 'unknown', + passed INTEGER NOT NULL DEFAULT 0, + failed INTEGER NOT NULL DEFAULT 0, + total INTEGER NOT NULL DEFAULT 0, + job_id TEXT NOT NULL DEFAULT '', + last_run DATETIME NOT NULL, + log_url TEXT NOT NULL DEFAULT '', + junit_url TEXT NOT NULL DEFAULT '', + failed_tests TEXT NOT NULL DEFAULT '[]', -- JSON []FailedTestCase + llm_analysis TEXT NOT NULL DEFAULT '', -- JSON LLMAnalysis or empty + PRIMARY KEY (name, env) +); + +-- Every individual run — used by the pipeline-detail history page. 
+CREATE TABLE IF NOT EXISTS pipeline_runs ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + name TEXT NOT NULL, + env TEXT NOT NULL, + version TEXT NOT NULL DEFAULT 'unknown', + status TEXT NOT NULL DEFAULT 'unknown', + passed INTEGER NOT NULL DEFAULT 0, + failed INTEGER NOT NULL DEFAULT 0, + total INTEGER NOT NULL DEFAULT 0, + job_id TEXT NOT NULL DEFAULT '', + date TEXT NOT NULL DEFAULT '', + last_run DATETIME NOT NULL, + log_url TEXT NOT NULL DEFAULT '', + junit_url TEXT NOT NULL DEFAULT '', + failed_tests TEXT NOT NULL DEFAULT '[]', -- JSON []FailedTestCase + llm_analysis TEXT NOT NULL DEFAULT '', -- JSON LLMAnalysis or empty + UNIQUE (name, env, job_id) -- deduplicate on re-process +); + +CREATE INDEX IF NOT EXISTS idx_runs_operator ON pipeline_runs (name, last_run DESC); + +-- Migration: the llm_analysis column is added to existing DBs that predate this field +-- by the best-effort ALTER TABLE statements in Open(). SQLite reports a "duplicate column" +-- error when the column already exists; Open() deliberately discards that error. +` + +// Store wraps the SQLite database connection and provides typed query methods. +type Store struct { + db *sql.DB +} + +// Open opens (or creates) the SQLite database at path and applies the schema. +// Use ":memory:" for an in-memory database (useful for tests). +func Open(path string) (*Store, error) { + db, err := sql.Open("sqlite", path) + if err != nil { + return nil, fmt.Errorf("open sqlite %s: %w", path, err) + } + + // SQLite performs best with a single writer connection. + db.SetMaxOpenConns(1) + + if _, err := db.Exec(schema); err != nil { + db.Close() + return nil, fmt.Errorf("apply schema: %w", err) + } + + // Best-effort migrations for existing databases. 
+ for _, tbl := range []string{"pipeline_latest", "pipeline_runs"} { + _, _ = db.Exec(`ALTER TABLE ` + tbl + ` ADD COLUMN llm_analysis TEXT NOT NULL DEFAULT ''`) + _, _ = db.Exec(`ALTER TABLE ` + tbl + ` RENAME COLUMN operator_name TO name`) + // Rewrite old presigned/plain S3 URLs to dashboard proxy URLs. 
+ // Extract the S3 key by stripping the bucket hostname prefix, then build /dashboard/s3?key= or /dashboard/junit?key= URLs. + // Handles both https://bucket.s3.amazonaws.com/key?presign and previously-migrated https://bucket.s3.amazonaws.com/key forms. + _, _ = db.Exec(`UPDATE ` + tbl + ` SET log_url = '/dashboard/s3?key=' || SUBSTR(SUBSTR(log_url, 1, INSTR(log_url||'?', '?')-1), INSTR(SUBSTR(log_url, 1, INSTR(log_url||'?', '?')-1), '.amazonaws.com/') + LENGTH('.amazonaws.com/')) WHERE log_url LIKE 'https://%'`) + _, _ = db.Exec(`UPDATE ` + tbl + ` SET junit_url = '/dashboard/junit?key=' || SUBSTR(SUBSTR(junit_url, 1, INSTR(junit_url||'?', '?')-1), INSTR(SUBSTR(junit_url, 1, INSTR(junit_url||'?', '?')-1), '.amazonaws.com/') + LENGTH('.amazonaws.com/')) WHERE junit_url LIKE 'https://%'`) + } + + log.Printf("Store: opened SQLite at %s", path) + return &Store{db: db}, nil +} + +// Close closes the underlying database connection. +func (s *Store) Close() error { return s.db.Close() } + +// Truncate removes all rows from pipeline_latest and pipeline_runs. +// Called before a full backfill so stale rows (S3 objects that have been deleted) don't persist. +func (s *Store) Truncate() error { + for _, tbl := range []string{"pipeline_latest", "pipeline_runs"} { + if _, err := s.db.Exec(`DELETE FROM ` + tbl); err != nil { + return fmt.Errorf("truncate %s: %w", tbl, err) + } + } + return nil +} + +// RunRecord is the flat struct used when writing to the store. +type RunRecord struct { + Name string + Env string + Version string + Status string + Passed int + Failed int + Total int + JobID string + Date string + LastRun time.Time + LogURL string + JUnitURL string + FailedTests []models.FailedTestCase + LLMAnalysis *models.LLMAnalysis +} + +// UpsertRun inserts or updates both pipeline_latest and pipeline_runs for one run result. 
+func (s *Store) UpsertRun(r RunRecord) error { + ft, err := json.Marshal(r.FailedTests) + if err != nil { + return fmt.Errorf("marshal failed_tests: %w", err) + } + + llmStr := "" + if r.LLMAnalysis != nil { + b, err := json.Marshal(r.LLMAnalysis) + if err != nil { + return fmt.Errorf("marshal llm_analysis: %w", err) + } + llmStr = string(b) + } + + tx, err := s.db.Begin() + if err != nil { + return fmt.Errorf("begin tx: %w", err) + } + defer tx.Rollback() //nolint:errcheck + + // Upsert pipeline_latest — only overwrite if this run is newer. + _, err = tx.Exec(` + INSERT INTO pipeline_latest + (name, env, version, status, passed, failed, total, job_id, last_run, log_url, junit_url, failed_tests, llm_analysis) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + ON CONFLICT(name, env) DO UPDATE SET + version = excluded.version, + status = excluded.status, + passed = excluded.passed, + failed = excluded.failed, + total = excluded.total, + job_id = excluded.job_id, + last_run = excluded.last_run, + log_url = excluded.log_url, + junit_url = excluded.junit_url, + failed_tests = excluded.failed_tests, + llm_analysis = excluded.llm_analysis + WHERE excluded.last_run > pipeline_latest.last_run + `, + r.Name, r.Env, r.Version, r.Status, + r.Passed, r.Failed, r.Total, + r.JobID, r.LastRun, r.LogURL, r.JUnitURL, + string(ft), llmStr, + ) + if err != nil { + return fmt.Errorf("upsert pipeline_latest: %w", err) + } + + // Insert pipeline_runs — ignore duplicate job_id. + _, err = tx.Exec(` + INSERT OR IGNORE INTO pipeline_runs + (name, env, version, status, passed, failed, total, job_id, date, last_run, log_url, junit_url, failed_tests, llm_analysis) + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
+ `, + r.Name, r.Env, r.Version, r.Status, + r.Passed, r.Failed, r.Total, + r.JobID, r.Date, r.LastRun, r.LogURL, r.JUnitURL, + string(ft), llmStr, + ) + if err != nil { + return fmt.Errorf("insert pipeline_runs: %w", err) + } + + return tx.Commit() +} + +// GetLatest returns all rows from pipeline_latest as []models.DeliverableStatus, +// grouped by operator name (one entry per operator, results keyed by env). +func (s *Store) GetLatest() ([]models.DeliverableStatus, error) { + rows, err := s.db.Query(` + SELECT name, env, version, status, passed, failed, total, + job_id, last_run, log_url, junit_url, failed_tests, llm_analysis + FROM pipeline_latest + ORDER BY name, env + `) + if err != nil { + return nil, fmt.Errorf("query pipeline_latest: %w", err) + } + defer rows.Close() + + index := make(map[string]*models.DeliverableStatus) + var order []string + + for rows.Next() { + var ( + name, env, ver, status string + passed, failed, total int + jobID, logURL, junitURL string + lastRun time.Time + ftJSON, llmJSON string + ) + if err := rows.Scan(&name, &env, &ver, &status, &passed, &failed, &total, + &jobID, &lastRun, &logURL, &junitURL, &ftJSON, &llmJSON); err != nil { + return nil, fmt.Errorf("scan pipeline_latest: %w", err) + } + + var failedTests []models.FailedTestCase + _ = json.Unmarshal([]byte(ftJSON), &failedTests) + + var llm *models.LLMAnalysis + if llmJSON != "" { + llm = &models.LLMAnalysis{} + if err := json.Unmarshal([]byte(llmJSON), llm); err != nil { + llm = nil + } + } + + er := &models.EnvironmentResult{ + Version: ver, + Status: status, + Passed: passed, + Failed: failed, + Total: total, + JobID: jobID, + LastRun: lastRun, + LogURL: logURL, + JUnitURL: junitURL, + FailedTests: failedTests, + LLMAnalysis: llm, + } + + op, ok := index[name] + if !ok { + op = &models.DeliverableStatus{ + Name: name, + Results: make(map[string]*models.EnvironmentResult), + } + index[name] = op + order = append(order, name) + } + op.Results[env] = er + if 
lastRun.After(op.LastUpdated) { + op.LastUpdated = lastRun + } + } + if err := rows.Err(); err != nil { + return nil, err + } + + result := make([]models.DeliverableStatus, 0, len(order)) + for _, name := range order { + result = append(result, *index[name]) + } + return result, nil +} + +// GetHistory returns all pipeline_runs for a given operator, newest first. +func (s *Store) GetHistory(operatorName string) (*models.PipelineHistory, error) { + rows, err := s.db.Query(` + SELECT env, version, status, passed, failed, total, + job_id, date, last_run, log_url, junit_url, failed_tests, llm_analysis + FROM pipeline_runs + WHERE name = ? + ORDER BY last_run DESC + `, operatorName) + if err != nil { + return nil, fmt.Errorf("query pipeline_runs: %w", err) + } + defer rows.Close() + + var runs []models.PipelineRun + for rows.Next() { + var ( + env, ver, status string + passed, failed, total int + jobID, date string + logURL, junitURL string + lastRun time.Time + ftJSON, llmJSON string + ) + if err := rows.Scan(&env, &ver, &status, &passed, &failed, &total, + &jobID, &date, &lastRun, &logURL, &junitURL, &ftJSON, &llmJSON); err != nil { + return nil, fmt.Errorf("scan pipeline_runs: %w", err) + } + + var failedTests []models.FailedTestCase + _ = json.Unmarshal([]byte(ftJSON), &failedTests) + + var llm *models.LLMAnalysis + if llmJSON != "" { + llm = &models.LLMAnalysis{} + if err := json.Unmarshal([]byte(llmJSON), llm); err != nil { + llm = nil + } + } + + runs = append(runs, models.PipelineRun{ + Env: env, + Version: ver, + Status: status, + Passed: passed, + Total: total, + JobID: jobID, + Date: date, + LastRun: lastRun, + LogURL: logURL, + JUnitURL: junitURL, + Failed: failedTests, + LLMAnalysis: llm, + }) + } + if err := rows.Err(); err != nil { + return nil, err + } + + // Group runs by version, preserving newest-first order per version. + // For each version we keep the newest run per env (int takes precedence over stage over prod). 
+ type versionKey = string + versionOrder := []versionKey{} + versionMap := make(map[versionKey]*models.VersionPipeline) + + for i := range runs { + run := &runs[i] + vp, exists := versionMap[run.Version] + if !exists { + vp = &models.VersionPipeline{ + Version: run.Version, + Date: run.Date, + LastRun: run.LastRun, + EnvRuns: make(map[string]*models.PipelineRun), + } + versionMap[run.Version] = vp + versionOrder = append(versionOrder, run.Version) + } + // Keep newest run per env (runs are already newest-first so first wins). + if _, seen := vp.EnvRuns[run.Env]; !seen { + vp.EnvRuns[run.Env] = run + } + if run.LastRun.After(vp.LastRun) { + vp.LastRun = run.LastRun + vp.Date = run.Date + } + } + + versions := make([]models.VersionPipeline, 0, len(versionOrder)) + for _, ver := range versionOrder { + versions = append(versions, *versionMap[ver]) + } + + return &models.PipelineHistory{ + Name: operatorName, + Runs: runs, + Versions: versions, + }, nil +} + +// groupKeySummary extracts a stable grouping key from an LLM root cause or failure message. +// It takes the first sentence (up to the first '.') capped at 120 bytes, then lowercases it +// and drops quote characters so minor LLM phrasing differences still cluster. +func groupKeySummary(text string) string { + if text == "" { + return "" + } + s := text + // Trim to the first sentence, or to 120 bytes if no '.' occurs before then + if idx := strings.Index(s, "."); idx > 0 && idx < 120 { + s = s[:idx] + } else if len(s) > 120 { + s = s[:120] + } + // Normalise: lowercase, drop quote characters, trim surrounding whitespace + s = strings.ToLower(s) + s = strings.Map(func(r rune) rune { + switch r { + case '\'', '"', '`', '‘', '’', '“', '”': + return -1 // drop quote characters + } + return r + }, s) + return strings.TrimSpace(s) +} + +// GetFailureGroups returns all failed runs grouped by the first sentence of the LLM root cause +// (falling back to the first line of the failure message). Groups with the same summary cluster +// across deliverables. 
Sorted by number of entries descending. +func (s *Store) GetFailureGroups() ([]models.FailureGroup, error) { + rows, err := s.db.Query(` + SELECT name, env, version, job_id, last_run, log_url, failed_tests, llm_analysis + FROM pipeline_runs + WHERE status != 'passed' AND (failed_tests != '[]' OR llm_analysis != '') + ORDER BY last_run DESC + `) + if err != nil { + return nil, fmt.Errorf("query failure groups: %w", err) + } + defer rows.Close() + + type groupKey = string + groupOrder := []groupKey{} + groups := make(map[groupKey]*models.FailureGroup) + + for rows.Next() { + var ( + name, env, ver, jobID string + logURL, ftJSON, llmJSON string + lastRun time.Time + ) + if err := rows.Scan(&name, &env, &ver, &jobID, &lastRun, &logURL, &ftJSON, &llmJSON); err != nil { + return nil, fmt.Errorf("scan failure groups: %w", err) + } + + var llm *models.LLMAnalysis + if llmJSON != "" { + llm = &models.LLMAnalysis{} + if err := json.Unmarshal([]byte(llmJSON), llm); err != nil || llm.RootCause == "" { + llm = nil + } + } + + var failedTests []models.FailedTestCase + _ = json.Unmarshal([]byte(ftJSON), &failedTests) + + // Determine grouping key: prefer LLM root cause summary, fall back to first failure message line + var key string + if llm != nil { + key = groupKeySummary(llm.RootCause) + } + if key == "" && len(failedTests) > 0 { + // First line of the first failure message + msg := failedTests[0].Message + if nl := strings.Index(msg, "\n"); nl > 0 { + msg = msg[:nl] + } + key = groupKeySummary(msg) + } + if key == "" { + continue + } + + entry := models.FailureEntry{ + Name: name, + Version: ver, + Env: env, + LastRun: lastRun, + JobID: jobID, + LogURL: logURL, + } + + if _, exists := groups[key]; !exists { + grp := &models.FailureGroup{ + FailureMatch: key, + } + if llm != nil { + grp.RootCause = llm.RootCause + grp.Recommendations = llm.Recommendations + } + groups[key] = grp + groupOrder = append(groupOrder, key) + } + grp := groups[key] + grp.Entries = 
append(grp.Entries, entry) + // Enrich with LLM if the group doesn't have it yet + if llm != nil && grp.RootCause == "" { + grp.RootCause = llm.RootCause + grp.Recommendations = llm.Recommendations + } + } + if err := rows.Err(); err != nil { + return nil, err + } + + result := make([]models.FailureGroup, 0, len(groupOrder)) + for _, key := range groupOrder { + result = append(result, *groups[key]) + } + // Sort: largest groups first + for i := 0; i < len(result)-1; i++ { + for j := i + 1; j < len(result); j++ { + if len(result[j].Entries) > len(result[i].Entries) { + result[i], result[j] = result[j], result[i] + } + } + } + + return result, nil +} + diff --git a/scripts/dashboard/deploy.sh b/scripts/dashboard/deploy.sh new file mode 100755 index 0000000000..32748e0fc8 --- /dev/null +++ b/scripts/dashboard/deploy.sh @@ -0,0 +1,57 @@ +#!/bin/bash +# Local dev deploy — builds image from source, pushes to quay, applies kustomize overlay. +# Not used by CI/prod. Set SQS_QUEUE_URL in overlays/local/configmap.yaml in hp-delivery-apps. +# +# Usage: DASHBOARD_QUAY_IMAGE=quay.io//delivery-dashboard:latest ./scripts/dashboard/deploy.sh +# Env: DASHBOARD_QUAY_IMAGE (required), QUAY_EXPIRE (e.g. 26w), OVERLAY (default: overlays/local) + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "${SCRIPT_DIR}/../.." 
&& pwd)" + +[[ -z "${DASHBOARD_QUAY_IMAGE:-}" ]] && { echo "Error: DASHBOARD_QUAY_IMAGE is not set."; exit 1; } + +IMAGE="${DASHBOARD_QUAY_IMAGE}" +QUAY_EXPIRE="${QUAY_EXPIRE:-}" +OVERLAY="${OVERLAY:-overlays/local}" +OVERLAY_DIR="$(cd "${REPO_ROOT}/../hp-delivery-apps" && pwd)/delivery-dashboard/${OVERLAY}" +NAMESPACE=$(grep "^namespace:" "${OVERLAY_DIR}/kustomization.yaml" | awk '{print $2}') +APP="delivery-dashboard" +BUILD_CTX="${REPO_ROOT}/configs/local/dashboard-build" + +echo "=== Delivery Dashboard Deployment ===" +echo "Overlay: ${OVERLAY} (namespace: ${NAMESPACE})" +echo "Image: ${IMAGE}" +echo "Cluster: $(oc whoami --show-server)" +echo "" + +oc new-project "${NAMESPACE}" 2>/dev/null || oc project "${NAMESPACE}" + +echo "Checking secrets..." +MISSING=0 +for SECRET in osde2e-ocm-credentials osde2e-aws-credentials; do + oc get secret "${SECRET}" -n "${NAMESPACE}" &>/dev/null \ + && echo " OK: ${SECRET}" \ + || { echo " MISSING: ${SECRET}"; MISSING=1; } +done +[[ "${MISSING}" -eq 1 ]] && { echo "Create missing secrets first (see hp-delivery-apps/delivery-dashboard/README.md)"; exit 1; } + +echo "[1/4] Building image..." +GOOS=linux GOARCH=amd64 GOFLAGS="-mod=mod" go build -o "${BUILD_CTX}/osde2e" "${REPO_ROOT}/cmd/osde2e/" +EXPIRE_ARG="${QUAY_EXPIRE:+--label quay.expires-after=${QUAY_EXPIRE}}" +# shellcheck disable=SC2086 +podman build ${EXPIRE_ARG} --platform linux/amd64 -t "${IMAGE}" "${BUILD_CTX}" + +echo "[2/4] Pushing image..." +podman push "${IMAGE}" + +echo "[3/4] Applying manifests..." +kustomize build "${OVERLAY_DIR}" | oc apply -f - +oc rollout restart "deployment/${APP}" -n "${NAMESPACE}" + +echo "[4/4] Waiting for rollout..." 
+oc rollout status "deployment/${APP}" -n "${NAMESPACE}" --timeout=120s + +echo "" +echo "Dashboard URL: https://$(oc get route live -n "${NAMESPACE}" -o jsonpath='{.spec.host}')/dashboard/pipelines" \ No newline at end of file diff --git a/scripts/dashboard/verify-build.sh b/scripts/dashboard/verify-build.sh new file mode 100755 index 0000000000..fbb23d43d6 --- /dev/null +++ b/scripts/dashboard/verify-build.sh @@ -0,0 +1,137 @@ +#!/bin/bash +# Build verification script for osde2e dashboard + +set -eo pipefail # pipefail: the "go build ... | tee" checks must see go build's exit status, not tee's + +echo "=== osde2e Dashboard Build Verification ===" +echo "" + +# Check Go version +echo "1. Checking Go version..." +if command -v go &> /dev/null; then + go version +else + echo "ERROR: Go not found in PATH" + exit 1 +fi +echo "" + +# Check if we're in the right directory +echo "2. Checking directory..." +if [ ! -f "go.mod" ]; then + echo "ERROR: Not in osde2e root directory" + exit 1 +fi +echo "✓ In osde2e root directory" +echo "" + +# Verify dashboard files exist +echo "3. Verifying dashboard files..." +FILES=( + "pkg/dashboard/models/types.go" + "pkg/dashboard/config/config.go" + "pkg/dashboard/collectors/reserves.go" + "pkg/dashboard/collectors/usage.go" + "pkg/dashboard/collectors/s3tests.go" + "pkg/dashboard/server/server.go" + "pkg/dashboard/server/templates.go" + "pkg/dashboard/handlers/utils.go" + "cmd/osde2e/dashboard/cmd.go" +) + +for file in "${FILES[@]}"; do + if [ -f "$file" ]; then + echo "✓ $file" + else + echo "✗ MISSING: $file" + exit 1 + fi +done +echo "" + +# Verify templates exist +echo "4. Verifying HTML templates..." 
+TEMPLATES=( + "pkg/dashboard/server/templates/base.html" + "pkg/dashboard/server/templates/dashboard.html" + "pkg/dashboard/server/templates/reserves.html" + "pkg/dashboard/server/templates/usage.html" + "pkg/dashboard/server/templates/tests.html" +) + +for template in "${TEMPLATES[@]}"; do + if [ -f "$template" ]; then + echo "✓ $template" + else + echo "✗ MISSING: $template" + exit 1 + fi +done +echo "" + +# Check for syntax errors (gofmt) +echo "5. Checking Go syntax..." +DASHBOARD_FILES=$(find pkg/dashboard cmd/osde2e/dashboard -name "*.go" 2>/dev/null) +if [ -n "$DASHBOARD_FILES" ]; then + gofmt -l $DASHBOARD_FILES > /tmp/dashboard-fmt-check.txt + if [ -s /tmp/dashboard-fmt-check.txt ]; then + echo "⚠ Files need formatting:" + cat /tmp/dashboard-fmt-check.txt + else + echo "✓ All files properly formatted" + fi +else + echo "⚠ No Go files found" +fi +echo "" + +# Try to build dashboard package +echo "6. Building dashboard package..." +if go build -v ./pkg/dashboard/... 2>&1 | tee /tmp/dashboard-build.log; then + echo "✓ Dashboard package builds successfully" +else + echo "✗ Build failed. See /tmp/dashboard-build.log for details" + exit 1 +fi +echo "" + +# Try to build main osde2e with dashboard +echo "7. Building osde2e with dashboard command..." +if go build -o /tmp/osde2e ./cmd/osde2e 2>&1 | tee /tmp/osde2e-build.log; then + echo "✓ osde2e builds successfully with dashboard command" +else + echo "✗ Build failed. See /tmp/osde2e-build.log for details" + exit 1 +fi +echo "" + +# Verify dashboard command is registered +echo "8. Verifying dashboard command..." +if grep -q "dashboard.Cmd" cmd/osde2e/main.go; then + echo "✓ Dashboard command registered in main.go" +else + echo "✗ Dashboard command NOT registered in main.go" + exit 1 +fi +echo "" + +# Test help command +echo "9. Testing dashboard help..." 
+if /tmp/osde2e dashboard --help > /tmp/dashboard-help.txt 2>&1; then + echo "✓ Dashboard help command works" + echo "" + echo "=== Dashboard Help Output ===" + cat /tmp/dashboard-help.txt +else + echo "✗ Dashboard help command failed" + exit 1 +fi +echo "" + +echo "===================================" +echo "✅ All verification checks passed!" +echo "===================================" +echo "" +echo "Dashboard is ready to use. Start with:" +echo " ./osde2e dashboard --port 8080" +echo ""