Added
less than a minute ago
Location
Type
Full time
Related skills
Azure, Terraform, AWS, Python, Kubernetes
Description
- Lead efforts to improve reliability, scalability, and performance
- Define and implement SLIs/SLOs and error budgets
- Build observability systems with actionable alerts and minimal noise
- Lead complex incident response as incident commander
- Conduct postmortems focused on systemic causes
- Eliminate toil via automation and better tooling
Requirements
- 6 to 10+ years of experience in SRE, infrastructure, or backend systems
- Proven reliability ownership for complex distributed systems
- Strong cloud experience (AWS, GCP, or Azure)
- Deep observability, incident management, and performance know-how
- Proficiency in Go, Python, or Java for automation
- Able to influence teams without formal managerial authority
Benefits
- Impactful work shaping customer experience
- Dynamic, collaborative culture
- Comprehensive benefits: health, vision, 401(k), commuter benefits