Added
20 days ago
Location
Type
Full time
Related skills
python kubernetes go cuda triton
Description
- Lead architecture, performance, and reliability across services.
- Drive cross-team design initiatives for inference workloads.
- Optimize latency, throughput, and GPU utilization in production.
- Tackle scheduling, batching, and memory optimization in Kubernetes infra.
- Provide hands-on technical leadership shaping engineering direction.
- Work on distributed systems and Kubernetes-based infrastructure.
Requirements
- 8–12+ years building large-scale distributed systems or cloud platforms.
- Proven cross-team leadership across multiple services.
- Strong Go, Python, or C++ programming skills.
- Production-scale Kubernetes expertise.
- Experience with inference frameworks: vLLM, Triton, TorchServe.
- GPU systems experience: CUDA, NCCL, RDMA, NUMA.
Benefits
- Medical, dental, and vision insurance, 100% paid by CoreWeave.
- 401(k) with generous employer match.
- Flexible PTO.
- Tuition reimbursement.
- Employee Stock Purchase Program (ESPP).
- Parental leave and family-forming support.