Design and build Hazel's evals platform end-to-end (scoring, datasets, CI/CD)
Build observability for AI quality: hallucinations, accuracy, latency, and cost signals
Architect data pipelines turning advisor interactions into evaluation datasets with privacy controls
Build and steward golden datasets with SMEs and advisors to define eval criteria
Develop LLM verification agents to catch hallucinations, computational errors, and compliance violations
Integrate evals into deployment pipelines to run regression tests before shipping

8+ years of engineering experience, with at least 2 years in evaluation infra or ML platforms
Deep familiarity with AI evaluation methods (RAG, docs, model assessment, human eval)
Experience designing and curating golden datasets: sampling, inter-rater agreement, versioning, edge cases
Comfort across the stack: data engineering (SQL, dbt, warehouses) and API/backend integration
Strong communication skills; able to translate domain needs into precise, automatable eval criteria
Bias toward shipping; build tools engineers actually want to use

Staff Back End Engineer, Evals - Hazel AI
