Added
less than a minute ago
Location
Type
Full time
Related skills
machine learning, data pipelines, model training, reinforcement learning, multi-agent systems
Description
- Design experiments to improve agentic model behavior for software/plugins.
- Own end-to-end post-training stack improvements: RL, data pipelines, graders, evals.
- Build evals/environments to reveal model failures and convert them into training data or fixes.
- Partner with Codex and ChatGPT teams to translate user signals into model improvements.
- Work on early-training and alignment interventions: data mixtures, objectives, eval loops.
- Improve large-scale training and launch machinery: velocity, reliability, observability, cost, latency.
Requirements
- Strong fundamentals in ML, software, systems, or statistics; quick to learn new areas.
- Hands-on with LLMs, RL, RLHF/RLAIF, post-training, evals, graders, synthetic data, production ML.
- Thrives on open-ended problems with noisy signals; blends research taste and engineering.
- Cares about product impact and practical model behavior.
- Turns vague problems into experiments: hypothesize, build a pipeline, run, analyze, decide.
- Comfortable across research, product, infra, data, evals, safety; communicates clearly.