Related skills
kubernetes ai infrastructure gpu nvidia triton
Description
- Team Leadership & Development: Recruit, mentor, coach engineers.
- Execution & Delivery: Own roadmaps, milestones, on-time delivery.
- Cross-Functional Partnership: Align with Product Management, other teams, and stakeholders.
- Operational Health: Ensure production health, stability, on-call rotation.
- Strategic Architecture & Planning: Roadmap for high-throughput scheduling on large clusters.
- Maximize GPU Utilization: Fractional GPU allocation and fair scheduling.
Requirements
- Engineering Leadership Experience: Leading high-performing teams.
- Kubernetes & AI Infrastructure: Scale, orchestration, AI workloads.
- Hardware-Aware Optimization: GPU architectures, interconnects, topology.
- Resource & Cost Management: Balance performance with cost; apply Dominant Resource Fairness (DRF) principles.
- AI/ML Serving Architectures: LLM serving patterns and engines (vLLM, Triton).
- Observability & SLOs: Define and monitor infrastructure metrics to drive improvements.
Benefits
- Career development resources and growth opportunities.
- Well-being programs and support (EAP).
- Flexible time off and local meetups.
- Conference/training reimbursements and LinkedIn Learning access.