Design and run experiments to improve agentic model behavior for computer use (desktop/browser).
Own end-to-end post-training improvements: RL, data pipelines, graders, rewards, evals, diagnostics.
Build evals and environments that expose model failures, then convert those failures into training data or fixes.
Partner with Codex/ChatGPT teams to translate user needs into model improvements.
Work on early training and alignment interventions (data mixtures, objectives, synthetic data, eval loops).
Decide which integrations and fixes are ready for major model runs.

