Added
13 days ago
Location
Type
Full time
Related skills
python kubernetes pytorch machine learning go
Description
- Develop Kubernetes-native benchmarking services for latency, throughput, jitter, and cost-per-request across CoreWeave's compute stack.
- Contribute to end-to-end MLPerf Training/Inference workflows, including workload setup, cluster config, and result validation.
- Lead design discussions and influence architecture within the team.
- Break down tasks into milestones and deliver reliable, high-quality code.
- Mentor junior engineers and share benchmarking best practices across teams.
Requirements
- 3-5 years of experience building distributed systems, HPC components, or cloud services.
- Strong Python or Go skills (C++ a plus) with networking and performance fundamentals.
- Hands-on Kubernetes in production plus CI/CD and observability tools (Prometheus, Grafana, OpenTelemetry).
- Experience with GPU systems (CUDA, NCCL, NVLink/PCIe) or model-serving stacks (llm-d, vLLM, TensorRT-LLM).
- Effective communicator who is comfortable working cross-functionally.
Benefits
- Medical, dental, and vision insurance, 100% paid by CoreWeave
- Company-paid Life Insurance
- Short and long-term disability insurance
- Flexible Spending Account
- Health Savings Account
- 401(k) with a generous employer match