Senior Site Reliability Engineer (SRE)

Type: Full time
Salary: Not provided

Related skills

Terraform, AWS, Python, Kubernetes, EKS

📋 Description

  • Lead design of large-scale, multi-region infra focusing on availability and latency.
  • Own the SRE roadmap: SLIs/SLOs, error budgets, reliability OKRs.
  • Architect observability platforms (metrics, logs, traces) for deep visibility.
  • Lead major incidents; run cross-team war rooms; drive blameless post-mortems.
  • Champion tooling for infra automation, chaos engineering, and capacity planning.
  • Mentor SREs and partner with Eng/Prod/Sec to align platform with business goals.

🎯 Requirements

  • 8+ years in SRE/Platform engineering; at least 3 years at senior level.
  • Bachelor's or Master's in CS/Engineering or equivalent.
  • Amazon EKS expert: cluster lifecycle, custom node groups, Karpenter.
  • Advanced Istio: control plane, Envoy filters, multi-cluster mesh.
  • Advanced AWS: EKS, VPC, IAM (IRSA), Route 53, ALB/NLB, RDS, S3; Terraform.
  • Strong networking across Kubernetes/EKS; VPC, Transit Gateway; packet capture a plus.
  • GitLab CI/CD at scale; GitOps, security scanning; Python/Go tooling; SLIs/SLOs.