Staff Software Engineer, Ads ML Inference Infrastructure

Added: 2 hours ago
Type: Full time

Related skills

TensorFlow, PyTorch, CUDA, inference, Triton

📋 Description

  • Lead next-gen model inference and feature serving for 100x larger models.
  • Design low-latency, high-throughput inference pipelines to meet SLOs.
  • Collaborate to productionize new model architectures (LLMs, ranking) and scale globally.
  • Evolve online feature platform for coverage, freshness, consistency.
  • Evaluate GPU acceleration, model compression, Triton, vLLM, Dynamo.
  • Partner with infra/ML teams to boost reliability and velocity.

🎯 Requirements

  • BS degree in Computer Science or related field.
  • 8+ years designing or operating large-scale ML or distributed infrastructure.
  • Deep knowledge of Java, C++, Python.
  • Experience with distributed systems or ads infrastructure (routing, storage, caching).
  • Hands-on with PyTorch or TensorFlow.
  • Proven track record leading complex projects and mentoring.

🎁 Benefits

  • Hybrid work model; in-person 1-2 days per week near Palo Alto/SF/Seattle.
  • PinFlex flexible working options; see the PinFlex information page for details.