Researcher, Artifacts - Agent Post-Training

Added
less than a minute ago
Type
Full time

Related skills

machine learning, data pipelines, model training, reinforcement learning, multi-agent systems

📋 Description

  • Design experiments to improve agentic model behavior for software/plugins.
  • Own end-to-end post-training stack improvements: RL, data pipelines, graders, evals.
  • Build evals/environments to reveal model failures and convert them into training data or fixes.
  • Partner with Codex and ChatGPT teams to translate user signals into model improvements.
  • Work on early-training and alignment interventions: data mixtures, objectives, eval loops.
  • Improve large-scale training and launch machinery: velocity, reliability, observability, cost, latency.

🎯 Requirements

  • Strong fundamentals in ML, software, systems, or statistics; quick to learn new areas.
  • Hands-on with LLMs, RL, RLHF/RLAIF, post-training, evals, graders, synthetic data, production ML.
  • Thrives on open-ended problems with noisy signals; blends research taste and engineering.
  • Cares about product impact and practical model behavior.
  • Turns vague problems into experiments: hypothesizes, builds the pipeline, runs it, analyzes the results, and decides.
  • Comfortable across research, product, infra, data, evals, safety; communicates clearly.