Related skills
ansible puppet grafana python kubernetes
Description
- Design and maintain monitoring solutions covering metrics, logs, and traces
- Deploy and manage monitoring infrastructure using Grafana, ICINGA2, and Site24x7
- Respond to automated alerts and production incidents
- Participate in on-call rotations supporting global operations
- Build automation to reduce operational overhead and improve reliability
Requirements
- 5+ years in Site Reliability Engineering, DevOps, or infrastructure operations
- 5+ years of Linux/Unix systems expertise and performance troubleshooting
- 3+ years of experience with observability platforms (metrics, logging, tracing, visualization)
- 3+ years of configuration management experience (Ansible, Puppet, or similar)
- 3+ years of professional experience in scripting (Python, Go, Bash, JavaScript, etc.)
- 3+ years of professional experience with event correlation or incident platforms
Benefits
- Competitive health benefits and retirement plans
- Diversity and inclusion commitments and ERGs
- Flexible remote work options with occasional office visits