Researcher, Context - Agent Post-Training

Type
Full time

Related skills

LLMs, RLHF, RL, RLAIF, evals

📋 Description

  • Design and run experiments to scale context compute.
  • Own end-to-end post-training stack improvements (RL, data pipelines).
  • Build evals and environments to surface failures and guide fixes.
  • Partner with Codex and ChatGPT teams to translate signals into model improvements.
  • Work on early-training and alignment interventions (data, eval loops).
  • Decide which integrations and fixes are ready for major runs.
  • Improve large-scale training machinery: velocity, reliability, cost, latency.
  • Debug hard failures in shipped or near-shipped models.

🎯 Requirements

  • Strong fundamentals in ML, software, systems, or statistics.
  • Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, graders, and synthetic data.
  • Experience with production ML systems and tooling for model training.
  • Able to work across research, product, infrastructure, data, evals, safety.
  • Comfort turning vague problems into concrete experiments and pipelines.
  • Clear communication with cross-functional teams.

🎁 Benefits

  • Hybrid work with flexible collaboration.
  • Work on frontier AI models with impact.
  • Learning and growth in ML research and systems.
  • Equal opportunity employer with inclusive policies.