Type
Full time
Related skills
kubernetes, go, gpu, slurm, sunk
📋 Description
- Lead architecture for cluster orchestration across Kubernetes, Slurm, SUNK, and Kueue.
- Define long-term architecture and solve scaling problems across schedulers and control planes.
- Balance performance, reliability, cost, and complexity in AI infrastructure.
- Lead evolution of Kubernetes-native control planes and custom operators.
- Design workload admission, validation, and rollout, including model onboarding flows.
- Mentor senior and staff engineers, influencing platform, security, and product teams.
🎯 Requirements
- 15+ years building and operating large-scale distributed systems.
- Deep knowledge of Kubernetes and Slurm internals.
- Experience running GPU-heavy AI training, inference, or HPC workloads.
- Strong Go and cloud-native systems development background.
- Proven ability to set technical direction across teams without direct authority.
- Bachelor’s or Master’s degree in a relevant field, or equivalent experience.
🎁 Benefits
- Medical, dental, and vision insurance — 100% paid.
- Company-paid life insurance.
- Voluntary supplemental life insurance.
- Short-term and long-term disability insurance.
- Flexible Spending Account.
- Health Savings Account.