Related skills
azure terraform grafana prometheus gcpπ Description
- Design monitoring and observability across services and infrastructure.
- Develop dashboards, alerts, metrics for visibility and reliability.
- Define alerting standards reflecting real service impact.
- Lead troubleshooting during complex incidents and root cause analysis.
- Improve monitoring coverage for critical services and components.
- Partner with service owners to align monitoring with SLAs/KPIs.
π― Requirements
- 6-8 years of SRE, platform ops, or infra reliability.
- Led technical initiatives or reliability practices within engineering teams.
- Experience managing production environments and large-scale systems.
- Terraform-based Infrastructure as Code.
- Cloud platforms with GCP or Azure.
- Experience supporting high-availability 24/7 environments.
π Benefits
- Flexible working hours
- Your birthday off
- Access to cutting-edge technology
- Nimble, small-business feel with learning freedom
- Inclusive culture with active networks
- Work-life balance
Meet JobCopilot: Your Personal AI Job Hunter
Automatically Apply to DevOps Jobs. Just set your
preferences and Job Copilot will do the rest β finding, filtering, and applying while you focus on what matters.
Help us maintain the quality of jobs posted on Empllo!
Is this position not a remote job?
Let us know!