Senior Engineer 2: Inference Data Plane

Added
27 minutes ago
Type
Full time

Related skills

Go, Python, distributed systems, LLM, Ray Serve

📋 Description

  • Technical Leadership: Lead design and delivery of data plane components for large generative AI models.
  • System Design: Architect high-scale, multi-tenant AI inference cloud components.
  • Performance Optimization: Optimize distributed inference with tensor/data parallelism and caching.
  • Collaboration: Align roadmaps with PMs, customer teams, and engineers.
  • Mentorship: Mentor junior engineers and foster technical excellence.
  • Operational Excellence: Maintain high-scale services with observability and SLOs.

🎯 Requirements

  • Distributed Systems Expertise: Experience building distributed systems with microservices, messaging, databases, and infrastructure as code.
  • AI/ML Domain Knowledge: Hosting large language or multimodal models with inference engines (vLLM, SGLang, Modular).
  • Inference Frameworks: Distributed inference serving frameworks (llm-d, NVIDIA Dynamo, Ray Serve).
  • Hardware & Interconnects: GPU optimization and interconnects (NVLink, XGMI, RoCE).
  • Architecture Proficiency: LLM architectures and optimizations (batching, quantization).
  • Software Engineering: Expert in Go or Python; experience with gRPC.
  • Cloud Operations: Shaping customer-facing software in high-scale environments.
  • Open Source Mindset: Building with open-source software.

🎁 Benefits

  • Innovate with purpose; build cloud/AI for builders.
  • Career development resources, conferences, training, LinkedIn Learning.
  • Well-being programs and global benefits, which vary by region.
  • Competitive compensation with equity and ESPP options.
  • Equal-opportunity employer and inclusive culture.
  • Employee assistance, local meetups, and growth opportunities.