Location
Type
Full time
Related skills
llms rlhf rl rlaif evals
Description
- Design and run experiments to scale context compute.
- Own end-to-end post-training stack improvements (RL, data pipelines).
- Build evals and environments to surface failures and guide fixes.
- Partner with Codex and ChatGPT teams to translate signals into model improvements.
- Work on early-training and alignment interventions (data, eval loops).
- Decide which integrations and fixes are ready for major runs.
- Improve large-scale training machinery: velocity, reliability, cost, latency.
- Debug hard failures in shipped or near-shipped models.
Requirements
- Strong fundamentals in ML, software, systems, or statistics.
- Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, graders, and synthetic data.
- Experience with production ML systems and tooling for model training.
- Able to work across research, product, infrastructure, data, evals, and safety.
- Comfortable turning vague problems into concrete experiments and pipelines.
- Clear communication with cross-functional teams.
Benefits
- Hybrid work with flexible collaboration.
- Work on frontier AI models with impact.
- Learning and growth in ML research and systems.
- Equal opportunity employer with inclusive policies.