Staff Back End Engineer, Evals - Hazel AI

Added
12 days ago
Type
Full time

Related skills

sql dbt apis observability llm

📋 Description

  • Design and build Hazel's evals platform end-to-end (scoring, datasets, CI/CD)
  • Build observability for AI quality: hallucinations, accuracy, latency, and cost signals
  • Architect data pipelines turning advisor interactions into evaluation datasets with privacy controls
  • Build and steward golden datasets with SMEs and advisors to define eval criteria
  • Develop LLM verification agents to catch hallucinations, computational errors, and compliance violations
  • Integrate evals into deployment pipelines to run regression tests before shipping

🎯 Requirements

  • 8+ years of engineering experience, with at least 2 years in evaluation infra or ML platforms
  • Deep familiarity with AI evaluation methods (RAG, docs, model assessment, human eval)
  • Experience designing and curating golden datasets: sampling, inter-rater agreement, versioning, edge cases
  • Comfort across the stack: data engineering (SQL, dbt, warehouses) and API/backend integration
  • Strong communication; translate domain needs into precise, automatable eval criteria
  • Bias toward shipping; build tools engineers actually want to use

🎁 Benefits

  • Hybrid work schedule for most positions
  • Office spaces in Culver City, SF, and Dallas
  • Competitive pay and equity for eligible positions
  • Premium healthcare, dental, and vision insurance plans
  • 401(k) with a 4% match and immediate vesting
  • One-month work-from-anywhere policy