Anthropic

Software Engineer, Safeguards Evals

Added

17 minutes ago

Location

🇺🇸 San Francisco

Type

Full time

Related skills

Python, distributed systems, data pipelines, LLMs, synthetic data generation

📋 Description

Build and own the evaluation harness for agentic investigations
Construct eval datasets representing real-world misuse across harm areas
Measure agent performance end-to-end and drive improvements on hard harm areas
Analyze coverage to identify measurement gaps and keep evals high-signal
Productionize successful research into regression and release pipelines
Build tooling that enables policy experts to author, run, and iterate on evaluations

🎯 Requirements

Proficiency in Python and comfort across the stack
Experience building and maintaining data pipelines
Experience with LLMs and agentic systems with tool use and multi-step reasoning
Strong data analysis skills and ability to derive insights from large datasets
Ability to move between research prototyping and production-quality code
Ability to translate ambiguous problems into concrete, testable experiments

🎁 Benefits

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
Collaborative office space

🛃 Visa sponsorship
