Senior Manager, Cloud Platform & Site Reliability

Type
Full time

Related skills

gitops terraform prometheus kubernetes multi-cloud

📋 Description

  • Lead, grow, and develop Cloud Platform and SRE team leads.
  • Set org-level direction and roadmap for infrastructure and reliability.
  • Own platform reliability end-to-end; enforce SLOs/SLIs and incident response.
  • Drive cross-functional collaboration with product, engineering, and customers.
  • Oversee incident management for high-severity issues and drive rapid resolution.
  • Translate operational pain points into roadmap priorities and runbook improvements.

🎯 Requirements

  • Bachelor's, Master's, or Ph.D. in CS, Engineering, Math, or related field.
  • Proven experience managing managers and multiple SRE/infra teams.
  • Deep expertise in Kubernetes (multi-cloud: EKS, GKE) and distributed systems.
  • Hands-on experience with Terraform or Pulumi and CI/CD tools (GitHub Actions, Jenkins).
  • Strong observability experience with Prometheus, OpenTelemetry, and Grafana.
  • Experience owning incident management and enterprise SLAs.

🎁 Benefits

  • Competitive compensation with meaningful equity.
  • 100% medical, dental, and vision for you and dependents.
  • Flexible PTO including Winter Break.
  • Paid parental leave.
  • Fertility and family-building stipend through Carrot.
  • 401(k) with company match.