Related skills
gRPC, Python, Go, microservices, distributed systems
Description
- Lead design and delivery of data plane components for large models.
- Architect scalable, multi-tenant AI inference cloud with high availability.
- Optimize distributed inference with tensor parallelism, KV cache, routing.
- Collaborate with PMs and teams to align roadmaps with customer needs.
- Mentor junior engineers to foster technical excellence.
- Operate high-scale services with observability and defined SLOs.
Requirements
- Experience in microservices, messaging, databases, and infrastructure as code.
- Hands-on experience hosting large LLM/multimodal models with serving engines (vLLM/SGLang/Modular).
- Experience with distributed inference serving (llm-d, Dynamo, Ray Serve).
- GPU optimization and interconnects (NVLink, XGMI, RoCE).
- Knowledge of LLM architectures and optimizations (batching, quantization).
- Proficient in Go or Python; experience with gRPC.
- Experience shipping customer-facing software in high-scale environments.
- Experience building with open-source software.
Benefits
- Career development resources and growth programs.
- Conference reimbursement and LinkedIn Learning access.
- Well-being benefits and flexible time off.
- Salary guidance and performance bonuses.
- Equity grants and Employee Stock Purchase Program.
- DigitalOcean is an equal-opportunity employer.