Senior Engineer 2: Inference Data Plane

Added
25 minutes ago
Type
Full time

Related skills

grpc python go vllm sglang

📋 Description

  • Technical leadership: drive end-to-end design of data plane components for AI models.
  • System design: architect a high-scale, multi-tenant AI inference cloud built for resiliency.
  • Performance optimization: apply tensor and data parallelism, KV caching, and smart routing.
  • Collaboration: work with PMs, customer teams, and engineers to align roadmaps.
  • Mentorship: coach junior engineers and foster technical excellence.
  • Operational excellence: maintain high-scale services with observability and SLOs.

🎯 Requirements

  • Distributed systems expertise: microservices, messaging systems, databases, and IaC.
  • AI/ML domain knowledge: hosting LLMs and multimodal models with engines such as vLLM and SGLang.
  • Inference frameworks: llm-d, NVIDIA Dynamo, Ray Serve.
  • Hardware & interconnects: GPU optimization; NVLink, xGMI, RoCE.
  • Architecture proficiency: LLM architectures and optimization (continuous batching, quantization).
  • Software engineering: Go or Python; gRPC.

🎁 Benefits

  • Career development resources and LinkedIn Learning.
  • Well-being: EAP, local meetups, flexible time off.
  • Equity grants and Employee Stock Purchase Program.
  • Bonus potential based on performance.
  • Competitive benefits; equal opportunity employer.