Added
1 hour ago
Type
Full time
Related skills
Terraform, GitHub Actions, Helm, Grafana, Prometheus
Description
- Own the reliability and performance of our Kubernetes-based data platform.
- Design and operate highly available, multi-region systems with uptime targets.
- Scale infrastructure, improve deployment pipelines, and harden security posture.
- Evolve DevSecOps practices while partnering with engineering for reliability from day one.
🎯 Requirements
- 5+ years in SRE/Platform/Infra roles.
- Kubernetes and containerized services expertise (cluster design, ops).
- CI/CD with Argo CD and GitHub Actions.
- Ownership of prod systems with HA ≥99.99%, incident response, SLI/SLO/SLA.
- Geo-replicated multi-region active-active design (routing, failover, data consistency).
- Observability with Prometheus, Grafana, OpenTelemetry.
Benefits
- Medical, dental, and vision insurance - 100% paid by CoreWeave
- 401(k) with generous employer match
- Flexible PTO
- Tuition Reimbursement
- Ability to Participate in Employee Stock Purchase Program (ESPP)
- Mental Wellness Benefits through Spring Health