Staff Site Reliability Engineer

Added: 10 hours ago
Type: Full time
Salary: Not provided

Related skills

Ansible, Terraform, CloudFormation, AWS, Python

📋 Description

  • Establish and evolve SRE best practices across the org, including incident response and postmortems.
  • Define and drive observability strategy with SLIs/SLOs, dashboards, and alerting quality.
  • Design and implement software-driven infrastructure solutions, automating toil.
  • Act as technical leader, guiding priorities across cloud infra, tooling, and platform architecture.
  • Own large, ambiguous initiatives from concept to delivery, aligning stakeholders.
  • Mentor engineers, provide architecture guidance, and review designs for reliability.

🎯 Requirements

  • Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.
  • 7+ years in SRE/infrastructure/platform engineering with impact at scale.
  • Deep troubleshooting skills across the stack, from application to kernel to network.
  • AWS experience preferred; GCP or Azure acceptable.
  • Kubernetes and container orchestration (EKS, Helm), plus strong experience with observability systems.
  • Infrastructure as Code using Terraform, Pulumi, CloudFormation, or Ansible.

🎁 Benefits

  • Impact healthcare for millions of patients.
  • Rapidly growing, collaborative, cross-functional team.
  • Equal opportunity employer with an inclusive culture.