Staff Site Reliability Engineer I

Type
Full time

Related skills

Datadog, Docker, Terraform, GitHub Actions, AWS

📋 Description

  • Own the technical direction of Remote's SRE/Platform domain and roadmap.
  • Define reliability strategy: SLOs/SLIs, error budgets, observability, incidents.
  • Lead cross-team infra initiatives from discovery to delivery.
  • Identify and drive AI enablement across the engineering org to reduce toil.
  • Drive AI-powered automation for platform ops: alerting, triage, and self-healing runbooks.

🎯 Requirements

  • 8+ years of experience in SRE/DevOps/Platform Engineering.
  • Deep Kubernetes expertise: operating, designing, and scaling production clusters.
  • Cloud infra at scale on AWS.
  • Infrastructure-as-code practice with Terraform.
  • SLOs/SLIs, error budgets, and alerting strategies.
  • Observability with Datadog, Grafana, and Prometheus.

๐ŸŽ Benefits

  • Work from anywhere
  • Flexible paid time off
  • Flexible working hours
  • 16 weeks paid parental leave
  • Stock options
  • Learning budget