Senior Engineer 2: Inference Data Plane

Added
5 hours ago
Type
Full time

Related skills

gRPC, Python, microservices, distributed systems, Go

📋 Description

  • Lead design and delivery of data plane components for large models.
  • Architect scalable, multi-tenant AI inference cloud with high availability.
  • Optimize distributed inference through tensor parallelism, KV caching, and routing.
  • Collaborate with PMs and teams to align roadmaps with customer needs.
  • Mentor junior engineers to foster technical excellence.
  • Operate high-scale services with observability and defined SLOs.

🎯 Requirements

  • Experience in microservices, messaging, databases, and infrastructure as code.
  • Hands-on experience hosting large LLMs and multimodal models with inference engines (vLLM, SGLang, Modular).
  • Experience with distributed inference serving (llm-d, Dynamo, Ray Serve).
  • GPU optimization and interconnects (NVLink, XGMI, RoCE).
  • Knowledge of LLM architectures and optimizations (batching, quantization).
  • Proficient in Go or Python; experience with gRPC.
  • Experience shipping customer-facing software in high-scale environments.
  • Experience building with open-source software.

🎁 Benefits

  • Career development resources and growth programs.
  • Conference reimbursement and LinkedIn Learning access.
  • Well-being benefits and flexible time off.
  • Salary guidance, bonuses, and equity compensation.
  • Equity grants and Employee Stock Purchase Program.
  • DigitalOcean is an equal-opportunity employer.