Related skills
gRPC, Python, microservices, distributed systems, Go

Description
- Provide technical leadership on data-plane components for large generative AI models.
- Design a scalable, multi-tenant AI inference cloud with high availability.
- Implement distributed inference hosting with tensor/data parallelism.
- Collaborate cross-functionally with PMs, customers, and other engineers.
- Mentor junior engineers and drive technical excellence.
- Operate high-scale services with observability and SLOs.
Requirements
- Distributed systems: microservices, messaging, databases, IaC.
- AI/ML domain knowledge: hosting LLMs with vLLM, SGLang, Modular.
- Inference frameworks: llm-d, NVIDIA Dynamo, Ray Serve.
- GPU hardware & interconnects: NVLink, XGMI, RoCE.
- Architecture: LLMs, batching, quantization.
- Software engineering: Go and/or Python; gRPC.
Benefits
- Work on cutting-edge AI/cloud tech with an ownership mindset.
- Strong career development: conferences, training, LinkedIn Learning.
- Global benefits and well-being programs, flexible time off.
- Competitive compensation and equity programs.
- Inclusive, equal-opportunity employer.