Related skills
Python, TypeScript, evaluation, datasets, benchmarks

Description
- Design evaluation frameworks to measure AI accuracy, reliability, and edge cases.
- Create and curate high-quality datasets, golden tests, and benchmarks from production data.
- Build automated test harnesses and metrics pipelines to evaluate models and prompts.
- Partner with applied AI engineers and product leaders to define measurable criteria.
- Own the evaluation lifecycle for major AI initiatives from experimentation to production.
Requirements
- 5+ years of professional experience in CS, ML, or a related field.
- Experience building testing, evaluation, or data infrastructure for AI/ML systems.
- Comfort writing production-quality code in Python and TypeScript.
- Experience with structured and unstructured data, labeling workflows, or data pipelines.
- Familiarity with ML evaluation techniques (offline/online metrics, regression testing).
- Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools.
Benefits
- Base salary: $240k–$280k USD per year.
- Equity included.
- Eligible for benefits and health insurance.
- See the Sentry Benefits page for details.