Senior Engineer 2: Inference Data Plane

Added
1 minute ago
Type
Full time

Related skills

gRPC, Python, microservices, distributed systems, Go

📋 Description

  • Technical leadership on data plane components for large generative AI models.
  • Design scalable, multi-tenant AI inference cloud with high availability.
  • Implement distributed inference hosting with tensor/data parallelism.
  • Cross-functional collaboration with PMs, customers, and other engineers.
  • Mentor junior engineers and drive technical excellence.
  • Operate high-scale services with observability and SLOs.

🎯 Requirements

  • Distributed systems: microservices, messaging, databases, IaC.
  • AI/ML domain knowledge: hosting LLMs with vLLM, SGLang, Modular.
  • Inference frameworks: llm-d, NVIDIA Dynamo, Ray Serve.
  • GPU hardware & interconnects: NVLink, XGMI, RoCE.
  • Architecture: LLMs, batching, quantization.
  • Software engineering: Go and/or Python; gRPC.

🎁 Benefits

  • Work on cutting-edge AI/cloud tech with an ownership mindset.
  • Strong career development: conferences, training, LinkedIn Learning.
  • Global benefits and well-being programs, flexible time off.
  • Competitive compensation and equity programs.
  • Inclusive, equal-opportunity employer.