Related skills
Datadog, Node.js, Docker, Terraform, AWS
📋 Description
- Design scalable blueprints for a global automation platform with high availability.
- Define SLIs, SLOs, and error budgets to balance velocity and reliability.
- Build and maintain observability pipelines with metrics, logs, and traces.
- Participate in incident resolution and blameless postmortems to improve reliability.
- Cultivate a learning culture from outages to harden the platform.
- Develop and automate CI/CD pipelines with canary or blue/green releases.
🎯 Requirements
- 6+ years of experience in Software Engineering or SRE with technical leadership
- Thorough understanding of applying SLI and SLO principles for reliability
- Deep proficiency in Linux/Unix-based infrastructure at scale
- Extensive experience with cloud providers, strong preference for AWS
- Expert-level Kubernetes in production
- Infrastructure as Code using Terraform
🎁 Benefits
- RSU grant and annual bonus
- Multinational team with 42 nationalities
- Learning & Development plan with 2 learning days per year
- Laptop (MacBook) and 34" curved monitor provided
- 25 vacation days, 4 sick days
- Remote working allowance