Related skills
Java, AWS, Prometheus, Python, Kubernetes
Description
- Build and maintain highly reliable, scalable systems.
- Design dashboards for OS/platform and app metrics (RED/USE).
- Establish SLIs/SLOs and error budgets for services.
- Implement performance monitoring and alerting to prevent issues.
- Participate in on-call rotations and lead incident response.
- Drive infrastructure automation and deployment.
Requirements
- 5+ years in SRE/DevOps or related field.
- Proficiency in at least two of the following: Python, Shell, Java, Node.js.
- Cloud: AWS, GCP, or Azure.
- Containerization with Docker and Kubernetes.
- Monitoring with Prometheus, Grafana, ELK stack.
- Infrastructure as Code: Terraform, Ansible, or similar.
- Git version control.
Benefits
- Blameless post-mortems to learn from failures.
- Automation-first culture; toil reduction.
- Growth opportunities in large-scale systems.
- Work-life balance and sustainable on-call practices.