Inference Runtime, Engineering Manager

Added
2 days ago
Type
Full time

Related skills

PyTorch, distributed systems, CUDA, InfiniBand, MPI

πŸ“‹ Description

  • Lead engineers working on distributed systems and model architecture.
  • Collaborate with ML researchers, engineers, and PMs to deploy technology to production.
  • Contribute across the stack, from infrastructure to performance tuning.
  • Introduce techniques and tools that improve inference performance.
  • Build tooling to identify bottlenecks and address high-priority issues.
  • Optimize code and the GPU fleet to maximize utilization of FLOPs and RAM.

🎯 Requirements

  • Understanding of modern ML architectures and inference optimization.
  • Ability to own problems end-to-end and self-direct to fill gaps.
  • 15+ years of software engineering experience.
  • Familiarity with PyTorch, NVIDIA GPUs, CUDA, and NCCL, as well as HPC technologies (InfiniBand, MPI, NVLink).
  • Experience architecting, building, observing, and debugging production distributed systems.
  • Comfort refactoring production systems at rapidly increasing scale.

🎁 Benefits

  • Equal opportunity employer.
  • OpenAI Affirmative Action and Equal Employment Policy.
  • Reasonable accommodations for applicants with disabilities.
  • OpenAI Global Applicant Privacy Policy.
  • Background checks as required by law.
