Related skills
kubernetes, go, observability, incident management, openstack
Description
- Practice the core pillars of Site Reliability Engineering.
- Develop tooling that monitors platform health and remediates problems automatically.
- Act as incident commander.
- Reduce incident rate and MTTR.
- Guide domain teams to more reliable designs.
- Run chaos experiments with domain teams and derive actions from results.
Requirements
- Passion for well-architected, elegant platform tools.
- Go programming experience is a plus.
- Proactive in identifying problems and preventing recurrence.
- Knowledge of distributed systems: fault tolerance, consistency, reliability, availability.
- Experience with Kubernetes and CNCF tools; OpenStack/vCloud is a plus.
- Technical English proficiency; curious about system failures; OSI-layer knowledge.
Benefits
- Hybrid working model with flexible schedule, including work-from-abroad options.
- Flexible FlexBenefits budget for meals, health insurance, and credits.
- Well-being support: doctors, psychologists, dietitian; HPV vaccination.
- Personalised training allowance and LMS access.
- Ownership from day one in a collaborative culture.
- International team with offices across Europe; growth opportunities.