Staff Software Engineer, Inference Runtime

Added
14 minutes ago
Type
Full time

Related skills

rust python kubernetes cuda triton

📋 Description

  • Set technical direction for the team's shared runtime architecture and roadmap
  • Own and evolve the accelerator-agnostic runtime in Rust and Python
  • Reduce expansion cost by aligning new models with the core runtime
  • Drive efficient accelerator usage: scheduling and memory management across GPU/TPU/Trainium
  • Build validation surfaces: partitioned builds, canary/shadow/rollback
  • Collaborate with Infrastructure on compilers, build systems, and tooling

🎯 Requirements

  • Deep background in systems engineering or ML infrastructure with perf profiling
  • Depth in at least one accelerator ecosystem (CUDA/GPU, TPU, or Trainium)
  • Extensive software engineering experience in high-performance distributed systems serving millions
  • Proven ability to define and use SLOs and metrics to drive latency and throughput
  • Experience aligning across org boundaries and contributing to shared infrastructure
  • Strong written and verbal communication; ability to influence technical direction

🎁 Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Lovely office space to collaborate

🛃 Visa sponsorship
