Related skills
rust, linux, python, distributed systems, asynchronous
Description
- Work across our Python and Rust stack
- Design, build, and maintain software to orchestrate ML workloads
- Profile and optimize software for frontier-scale orchestration
- Improve reliability and fault tolerance for long-running jobs
- Debug distributed systems across large clusters
- Respond to evolving ML needs to enable researchers
Requirements
- Experience developing distributed systems (not just operating)
- Understand large systems' behavior and failure at scale
- Care deeply about performance, correctness, and reliability
- Proficient in Python and Rust (or C++)
- Strong Linux knowledge: debugging, performance analysis, memory profiling
- Comfortable with asynchronous and concurrent systems
Benefits
- Hybrid work model: 3 days in the office per week
- Relocation assistance for new employees