Researcher, Agent Post-Training, Personality

Added: 1 day ago
Type: Full time

Related skills

training LLMs, RLHF, synthetic data, post-training

📋 Description

  • Develop a rigorous understanding of what makes an agent a great collaborator.
  • Turn judgments about model behavior into hypotheses, evals, graders, and training.
  • Study user signals to understand trust, satisfaction, and outcomes.
  • Work with experts to produce high-quality rollout data and evaluations.
  • Improve reward models and RL objectives for model behaviors.
  • Partner with ChatGPT, Codex, and other teams to validate improvements in real workflows.

🎯 Requirements

  • Think from the user’s perspective and care about how models feel to interact with.
  • Translate subjective product questions into falsifiable hypotheses and evaluations.
  • Preserve individuality, adaptability, and behavioral diversity.
  • Shape how frontier agents communicate and build trust.
  • Strong foundations in ML, software, statistics, and HCI; quick to learn a new stack.
  • Experience with LLMs, post-training, RL/RLHF, and reward modeling.