Related skills
rust python kubernetes go vllmπ Description
- Design and build multi-tenant inference platform components for Atlas integration.
- Collaborate with AI engineers to productionize inference for embedding models.
- Contribute to latency-aware routing, model versioning, health monitoring.
- Improve performance, autoscaling, GPU utilization in a cloud-native env.
- Work across product, infra, and ML teams to meet Atlas scale and latency.
- Gain hands-on with vLLM and Kubernetes for container orchestration.
π― Requirements
- 2+ years of experience building backend or infrastructure systems at scale.
- Strong software engineering skills in Go, Rust, Python, or C++, with emphasis on performance and reliability.
- Experienced in cloud-native architectures, distributed systems, and multi-tenant design.
- Familiar with ML model serving and inference runtimes, even if not deploying models.
- Knowledge of vector search systems (Faiss, HNSW) is a plus.
- Comfortable working across ML, backend, and platform teams.
- Motivated to work on systems integrated into MongoDB Atlas used by thousands of developers.
π Benefits
- Equity participation and employee stock purchase program.
- Flexible paid time off.
- 20 weeks fully-paid parental leave.
- Fertility and adoption assistance.
- 401(k) retirement plan.
- Mental health counseling and support.
Meet JobCopilot: Your Personal AI Job Hunter
Automatically Apply to Engineering Jobs. Just set your
preferences and Job Copilot will do the rest β finding, filtering, and applying while you focus on what matters.
Help us maintain the quality of jobs posted on Empllo!
Is this position not a remote job?
Let us know!