Related skills
Python, PyTorch, reinforcement learning, distributed training, CUDA
Description
- Design RL environments and coding tasks.
- Build reward signals and verifiers for 'good code'.
- Run training experiments on frontier models.
- Diagnose why models improve or fail on software tasks.
- Improve speed and reliability of end-to-end pipelines.
- Collaborate with alignment and frontier red teams.
Requirements
- Strong software engineering skills with deep Python expertise.
- Experience with async/concurrent programming.
- Ability to own systems end-to-end and debug across the stack.
- Balance research exploration with engineering implementation and experimental design.
- Focus on code quality, testing, and performance.
- Passion for safe, beneficial AI and responsible deployment.
Benefits
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Office space in San Francisco for in-person collaboration.
Visa sponsorship