Related skills
Datadog, Azure, Terraform, AWS, Prometheus
Description
- Develop and adopt AI-powered tools to make Development and Operations processes more efficient
- Collaborate with developers and weather scientists to optimize performance, reliability, scale, security, and cost
- Evolve and maintain adaptive cloud infrastructure for growth and scale
- Build self-service platforms for scientists and developers to work independently
- Introduce and integrate MLOps practices for GPU-based model deployment on Kubernetes
- Maintain production availability by participating in DevOps on-call shifts
Requirements
- 4+ years of experience as a DevOps engineer or SRE in Linux environments, working with AWS/GCP/Azure and infrastructure as code (Terraform or Crossplane)
- Experience with CI/CD tools and Kubernetes deployment
- Strong ownership and accountability for service reliability
- Comfort with AI-powered tools and willingness to experiment
- Experience implementing monitoring (Datadog, Prometheus, ELK)
- Experience in agile environments with high-velocity teams
- Proficiency in Python, Node.js, and Go
- Adaptable problem-solving mindset in changing environments
Benefits
- Flexible hours
- Unlimited vacation days