Related skills
azure terraform grafana prometheus opentelemetry๐ Description
- Design and implement monitoring strategies across systems.
- Collaborate on Grafana, Prometheus/Loki, Azure Monitor.
- Define golden signals, SLOs/SLIs, and actionable alerting.
- Drive logging, tracing, and OpenTelemetry standards for teams.
- Lead production incident response and remediation.
- Conduct blameless post-incident reviews and follow-ups.
๐ฏ Requirements
- 5+ years in SRE/Production Engineering or related roles.
- Strong knowledge of cloud-native systems, preferably Azure.
- Experience with observability tooling (Grafana/Prometheus/Loki/Azure Monitor).
- Understanding of DR concepts, failover validation, and readiness.
- Strong grasp of SRE principles: SLOs/SLIs, error budgets, toil.
- Strong collaboration and communication skills.
Meet JobCopilot: Your Personal AI Job Hunter
Automatically Apply to Engineering Jobs. Just set your
preferences and Job Copilot will do the rest โ finding, filtering, and applying while you focus on what matters.
Help us maintain the quality of jobs posted on Empllo!
Is this position not a remote job?
Let us know!