Related skills
sre, change management, observability, reliability, incident management
📋 Description
- Build and lead a 24/7 team of reliability and observability engineers.
- Document provisioning, validation, and troubleshooting of server nodes.
- Drive automation and event-driven remediation to improve resilience.
- Provide 24/7 engineering support for high-criticality node delivery.
- Enhance onboarding, documentation, enablement, and performance management.
- Shape culture and communications to enable CoreWeave across teams.
🎯 Requirements
- 7+ years in software or infra engineering with 2+ years in leadership.
- Strong SRE fundamentals, incident management, observability, and change management.
- Champion automation and cross-team tooling to improve reliability.
- Enjoy helping people grow; extend influence to partners and leadership.
- Experience leading reliability programs for high-scale fleets.
- Strong communication and leadership skills.
🎁 Benefits
- Medical, dental, and vision insurance—100% paid by CoreWeave.
- Company-paid life insurance.
- 401(k) with generous employer match.
- Flexible PTO and paid parental leave.
- Tuition reimbursement and ESPP.
- Catered lunch and a casual work environment.