Location
Type
Full time
Related skills
rust java kubernetes go scala
Description
- Design infra for large-scale experiments and training (HPC, GPU cloud)
- Build job submission, scheduling, and monitoring abstractions
- Create tooling to boost research productivity and reduce iteration time
- Shape long-term infra roadmap for Databricks AI Research
- Mentor engineers on compute, infra, and AI systems
Requirements
- BS/MS or PhD in Computer Science or related field
- 5+ years of software engineering experience, including large-scale distributed systems or infra
- Deep experience building/operating distributed systems, data pipelines, or backend services with GPUs/clusters
- Proficient in C++, Rust, Go, Java, or Scala
- Built or contributed to cluster schedulers or large-scale job orchestration (Kubernetes, Slurm, Ray)
- Understanding of modern ML training/inference workflows
- Fast, pragmatic delivery with a focus on reliability and ops
- Strong communication between researchers and engineers
Benefits
- Comprehensive benefits and perks
- Region-specific details via link
- Commitment to diversity and inclusion