Related skills
datadog azure aws gcp vmware
Description
- Lead live service monitoring, incident response, and service restoration.
- Act as primary technical escalation point for TOC during high-severity incidents.
- Partner with Infra, SRE, and Development to improve observability and availability.
- Refine, document, and champion TOC SOPs and Runbooks across shifts.
- Establish and audit operational best practices for deployment and monitoring.
Requirements
- 5+ years in high-volume TOC/SRE/NOC roles
- 2+ years leading major incident management
- Deep cloud infra experience: AWS, Azure, or GCP
- VMware tech: vSphere, ESXi, vCenter
- Excellent communication and cross-team collaboration
- Bachelor's degree preferred; ITIL/SRE certs a plus
Benefits
- Great company culture and innovation
- Growth opportunities and collaboration
- Work hard, enjoy life with team events
- Comprehensive benefits package and wellness programs