Staff Site Reliability Engineer — Project Volcano

Added: 7 days ago
Type: Full-time
Salary: Not provided

Related skills

Datadog, Terraform, Helm, PostgreSQL, Redis

📋 Description

  • Own end-to-end Volcano reliability: define SLOs, error budgets, and incident response.
  • Design the multi-region Kubernetes infrastructure and data plane for Volcano.
  • Implement GitOps workflows: deployment automation, canary releases, and preview environments with ArgoCD, Helm, and Terraform/Terragrunt.
  • Scale multi-tenant PostgreSQL, Redis, and object storage; ensure tenant isolation and disaster recovery (DR).
  • Instrument services with SLIs; build dashboards, alerts, and runbooks with Datadog, Prometheus, and Grafana.
  • Collaborate with OCTO, product, and security to bake reliability into architecture.