Related skills
incident response, incident management, reliability engineering, site reliability engineering, playbooks
Description
- Own end-to-end incident management: response, escalation, and remediation tracking.
- Run incident commander rotations for clear ownership during incidents.
- Drive follow-up: capture, track, and close action items across teams.
- Partner with engineering to make incident tooling more coherent.
- Lead incident review forums; ensure learnings are acted on.
- Develop and maintain incident docs, playbooks, and training materials.
Requirements
- Have 7+ years of experience in technical program management, incident management, or site reliability engineering.
- Have led incident response programs at a technology company, ideally in a high-growth or infrastructure-intensive environment.
- Are comfortable participating in on-call responsibilities and leading incident response during high-severity events.
- Have experience building and scaling operational processes from the ground up.
- Excel at driving accountability and follow-through across multiple teams without direct authority.
- Are highly organized with a knack for bringing structure to ambiguous situations.
Benefits
- Competitive compensation and benefits
- Optional equity donation matching
- Generous vacation and parental leave
- Flexible working hours
- SF office space
Visa sponsorship