Added
less than a minute ago
Location
Type
Full time
Related skills
ai rlhf dpo grpo agent_evaluation
Description
- Research AI agent capabilities focusing on safety, risk factors, and benchmarking methods.
- Design harnesses to test agents' tendency to take harmful actions under pressure or manipulation.
- Design exploits and mitigations for failure modes as agents gain affordances like coding and web use.
- Characterize and design mitigations for risks in multi-agent systems.
Requirements
- Commitment to safe, secure, and trustworthy AI deployments.
- Ability to conduct collaborative technical research and build evaluation harnesses and prototypes.
- Experience with post-training and RL techniques: RLHF, DPO, GRPO.
- Published ML research, especially in generative AI.
- At least three years of experience addressing sophisticated ML problems.
- Strong written and verbal communication for cross-functional teams.
Benefits
- Comprehensive health, dental, and vision coverage.
- Retirement benefits.
- Learning and development stipend.
- Generous PTO.
- Commuter stipend may be available.