Research Scientist, Agent Robustness

Type
Full time

Related skills

ai rlhf dpo grpo agent_evaluation

📋 Description

  • Research AI agent capabilities, focusing on safety, risk factors, and benchmarking methods.
  • Design harnesses to test agents' tendency to take harmful actions under pressure or manipulation.
  • Design exploits and mitigations for failure modes as agents gain affordances like coding and web use.
  • Characterize and design mitigations for risks in multi-agent systems.

🎯 Requirements

  • Commitment to safe, secure, and trustworthy AI deployments.
  • Ability to conduct collaborative technical research and build evaluation harnesses and prototypes.
  • Experience with post-training and RL techniques: RLHF, DPO, GRPO.
  • Published ML research, especially in generative AI.
  • At least three years of experience addressing sophisticated ML problems.
  • Strong written and verbal communication for cross-functional teams.

🎁 Benefits

  • Comprehensive health, dental, and vision coverage.
  • Retirement benefits.
  • Learning and development stipend.
  • Generous PTO.
  • Commuter stipend may be available.