Related skills
machine learning, metrics, genai, evaluation, retrieval
Description
- Own evaluation strategy for Datadog's AI agent integrations and metrics.
- Build eval datasets, golden traces, and regression harnesses.
- Improve retrieval relevance and tool selection; partner with AI engineers.
- Research agent-data interaction: tool selection, multi-turn eval, grounding.
- Collaborate with Bits teams to share measurement substrate.
- Provide technical leadership via design reviews, mentorship, and talks.
Requirements
- BS/MS/PhD in a scientific field, or equivalent.
- 10+ years in engineering or applied science, including prior experience as a tech lead.
- Proven ML/GenAI leadership from research to production.
- Experience with evaluation and measurement of ML systems at scale.
- Strong product mindset; cross-functional leadership.
- Thrives in ambiguity; makes sound technical calls.
Benefits
- New hire RSUs and ESPP.
- Continuous development, product training, and career pathing.
- Inclusive culture and Community Guilds.
- Global benefits including Spring Health for dependents.