Custos-com/Burnless

██████╗ ██╗   ██╗██████╗ ███╗   ██╗██╗     ███████╗███████╗███████╗
██╔══██╗██║   ██║██╔══██╗████╗  ██║██║     ██╔════╝██╔════╝██╔════╝
██████╔╝██║   ██║██████╔╝██╔██╗ ██║██║     █████╗  ███████╗███████╗
██╔══██╗██║   ██║██╔══██╗██║╚██╗██║██║     ██╔══╝  ╚════██║╚════██║
██████╔╝╚██████╔╝██║  ██║██║ ╚████║███████╗███████╗███████║███████║
╚═════╝  ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═══╝╚══════╝╚══════╝╚══════╝╚══════╝

Stop burning your error budget. Stop burning out your team.

Full SRE config as code — SLOs, error budgets, runbooks, on-call, and dashboards in one repo. Auto-remediates before alerts even fire.


The problem

Your SLOs live in a Datadog dashboard. Your runbooks live in Confluence — stale since last quarter. Your alert thresholds are configured manually in Grafana, differently in every environment.

None of it is versioned. None of it is reviewable. None of it executes itself at 3am.

Burnless changes that.


Quick start

# Install
curl -fsSL https://burnless.dev/install.sh | sh

# Scaffold a new sre.yaml
burnless init

# Validate
burnless validate sre.yaml

# Deploy to Prometheus + Grafana + PagerDuty
burnless apply sre.yaml

# Watch live burn rate
burnless status payments-api

# Start the auto-remediation agent
burnless agent start

The sre.yaml

service: payments-api
team: platform-engineering

slos:
  - name: availability
    target: 99.9%
    window: 30d
    indicator:
      metric: http_requests_total
      good_filter: 'status!~"5.."'

error_budget:
  burn_rate_alerts:
    - severity: critical
      rate: 14.4x
      window: 1h
      remediate: scale-up

    - severity: warning
      rate: 6x
      window: 6h
      remediate: restart-pods

runbooks:
  scale-up:
    mode: auto
    steps:
      - kubectl scale deploy/payments --replicas=+2
      - wait 60s
      - assert slo.availability > 99.5%

  restart-pods:
    mode: semi-auto
    steps:
      - kubectl rollout restart deploy/payments

oncall:
  provider: pagerduty
  escalation_minutes: 10
  notify_slack: "#sre-incidents"

dashboards:
  provider: grafana
  auto_generate: true
  panels:
    - error_budget_remaining
    - burn_rate_1h
    - burn_rate_6h
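The burn-rate thresholds in the example are easier to review once they are turned into budget numbers. The sketch below is not part of Burnless; it only works through the arithmetic for the `target`, `window`, and `burn_rate_alerts` values shown above, assuming the usual definition of burn rate (a rate of 1x spends exactly the whole error budget over the SLO window). The names `BurnRateAlert`, `error_budget_minutes`, `budget_consumed_pct`, and `hours_to_exhaustion` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

HOURS_PER_DAY = 24


@dataclass(frozen=True)
class BurnRateAlert:
    """One entry of error_budget.burn_rate_alerts (hypothetical mirror of the YAML)."""
    severity: str
    rate: float          # multiple of the "exactly on budget" consumption rate
    window_hours: float  # alert evaluation window


def error_budget_minutes(target_pct: float, window_days: int) -> float:
    """Minutes of total unavailability the SLO tolerates over its window."""
    window_minutes = window_days * HOURS_PER_DAY * 60
    return window_minutes * (100.0 - target_pct) / 100.0


def budget_consumed_pct(alert: BurnRateAlert, window_days: int) -> float:
    """Share of the whole budget spent if the rate holds for the alert window."""
    slo_window_hours = window_days * HOURS_PER_DAY
    return 100.0 * alert.rate * alert.window_hours / slo_window_hours


def hours_to_exhaustion(rate: float, window_days: int) -> float:
    """Hours until the budget is gone if the burn rate is sustained."""
    return window_days * HOURS_PER_DAY / rate


# Values copied from the sre.yaml example: 99.9% over 30d, 14.4x/1h, 6x/6h.
TARGET_PCT = 99.9
WINDOW_DAYS = 30
ALERTS = [
    BurnRateAlert("critical", 14.4, 1),
    BurnRateAlert("warning", 6.0, 6),
]

print(f"budget: {error_budget_minutes(TARGET_PCT, WINDOW_DAYS):.1f} min")
for alert in ALERTS:
    spent = budget_consumed_pct(alert, WINDOW_DAYS)
    left = hours_to_exhaustion(alert.rate, WINDOW_DAYS)
    print(f"{alert.severity}: {spent:.1f}% spent when it fires, "
          f"{left:.0f}h to exhaustion at this rate")
```

With these inputs, the 30-day budget is 43.2 minutes. The critical alert fires after about 2% of the budget is spent and the warning after about 5%; sustained, those rates would exhaust the budget in 50 hours and 120 hours respectively.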

License

Apache 2.0 — see LICENSE.

Burnless uses a three-tier license strategy:

| Layer | Files | License |
|---|---|---|
| SDK & Schema | pkg/, schema/, examples/ | Apache 2.0 (maximum ecosystem reach) |
| Core CLI & Agent | cmd/, internal/, deploy/ | AGPLv3 (free forever, including SSO) |
| SaaS Dashboard | saas/, dashboard/ | BSL 1.1 (free for dev, paid for production SaaS) |

TL;DR: If you're an SRE using the CLI or self-hosting the agent, it's free, forever. If you want to offer a managed Burnless service, contact us.

See licenses/ for full license texts.
