Senior Software Engineer, AI Eval

Added
13 days ago
Type
Full time

Related skills

python typescript evaluation datasets benchmarks

📋 Description

  • Design evaluation frameworks to measure AI accuracy, reliability, and edge cases.
  • Create and curate high-quality datasets, golden tests, and benchmarks from production data.
  • Build automated test harnesses and metrics pipelines to evaluate models and prompts.
  • Partner with applied AI engineers and product leaders to define measurable criteria.
  • Own the evaluation lifecycle for major AI initiatives from experimentation to production.

🎯 Requirements

  • 5+ years of professional experience in CS, ML, or a related field.
  • Experience building testing, evaluation, or data infrastructure for AI/ML systems.
  • Comfort writing production-quality code in Python and TypeScript.
  • Experience with structured and unstructured data, labeling workflows, or data pipelines.
  • Familiarity with ML evaluation techniques (offline/online metrics, regression testing).
  • Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools.

🎁 Benefits

  • Base salary: $240k–$280k USD per year.
  • Equity included.
  • Eligible for benefits and health insurance.
  • See the Sentry Benefits page for details.