VP of Product, Research and Training Infrastructure

Added: 17 days ago
Type: Full-time

Related skills: Kubernetes, RLHF, HPC, RL, Slurm

📋 Description

  • Lead product strategy for Research Training Stack and orchestration tools.
  • Evolve SUNK (Slurm on Kubernetes) for deterministic bare-metal HPC.
  • Drive training services and eval frameworks for model quality.
  • Build RL/RLHF pipelines enabling efficient model refinement.
  • Partner with global AI labs to translate research needs into roadmaps.

🎯 Requirements

  • 15+ years engineering leadership experience.
  • 5+ years managing large-scale infra at AI labs or cloud providers.
  • Domain expertise: Slurm, Kubernetes, InfiniBand/RDMA for training.
  • Research mindset: frontier model pre/post-training experience.
  • Scaling: multi-thousand GPU clusters (H100/Blackwell/Rubin).
  • Strategic vision for next-gen AI stack (RL loops, sandbox envs).

🎁 Benefits

  • Hybrid work; remote options for eligible candidates.
  • Onboarding at a hub within the first month; quarterly team gatherings.
  • Comprehensive benefits: medical, dental, vision, life.
  • 401(k) with match; equity awards; ESPP.
  • Tuition reimbursement; parental leave; flexible PTO.
  • Casual, innovative culture focused on disruption.