Senior Staff Site Reliability Engineer

Type
Full time

Related skills

Datadog, Terraform, Grafana, Prometheus, Python

📋 Description

  • Standardize SRE procedures across service catalog
  • Instrument SLOs/SLIs and set up burn-rate alerting
  • Build internal tooling in Go/Python
  • Create executive dashboards with Grafana/Datadog
  • Reduce toil via automation
  • Collaborate on infrastructure code reviews

🎯 Requirements

  • 8+ years in SRE/Prod Eng/DevOps
  • Strong software engineering (Go, Python)
  • Deep understanding of SLIs and error budgets
  • Kubernetes (EKS/GKE/self-managed)
  • IaC: Terraform or Pulumi
  • Observability: Prometheus, Grafana, Datadog, Jaeger

🎁 Benefits

  • Pay-for-performance culture
  • Inclusive, equal-opportunity employer
  • Hybrid in San Jose, CA
  • Reasonable accommodations available
  • Diverse, collaborative team
  • Growth and learning opportunities