Site Reliability Engineer

Added: less than a minute ago
Type: Full time
Salary: not provided

Related skills

SRE, Terraform, AWS, Kubernetes, GCP

📋 Description

  • Define and own SLOs, SLIs, and error budgets for Devin and Windsurf.
  • Build monitoring, alerting, and observability for service health.
  • Lead fast incident response and run blameless postmortems.
  • Create runbooks and tooling for sustainable on-call.
  • Own CI/CD pipelines and deployment infrastructure.
  • Reduce toil with automation and developer tooling.

🎯 Requirements

  • Deep experience running production systems at scale: SLOs, on-call, incident command.
  • Strong software fundamentals; SREs write real code rather than just configuring tools.
  • Cloud infrastructure (AWS, GCP, or Azure), Kubernetes, and Terraform.
  • Experience building and owning CI/CD pipelines and deployment infrastructure.
  • Strong observability instincts; instrument systems and design useful alerts.
  • Proven track record of reducing toil through automation.

๐ŸŽ Benefits

  • Small, selective team shipping products used by thousands of developers.
  • High ownership and trust; set the reliability bar.
  • An environment that rewards proactive, systematic reliability work as a craft.