Type
Full time
Related skills
datadog, java, python, go, rag
Description
- Define AI quality standards; evaluate, validate, and monitor AI agents.
- Build evaluation infrastructure for LLM outputs, agent behavior, tool use, and multi-turn conversations.
- Provide observability for agent systems, covering drift, latency, accuracy, and hallucinations.
- Lead agentic test strategy, including red-teaming and handling of non-determinism.
- Champion developer experience; build internal tooling and fast eval loops.
- Drive AI-first engineering culture; set patterns and education for safe AI features.
Requirements
- Bachelor's degree in Computer Science, Engineering, or equivalent.
- 8+ years building and operating production software systems.
- Experience evaluating or testing LLM-powered features or autonomous agents in production.
- Proficiency with AI-assisted development tools (Claude Code, Cursor, or equivalent).
- Strong backend fundamentals in Python, Java, or Go.
- Experience designing test infrastructure, CI/CD quality gates, or evaluation pipelines at scale.
Benefits
- Comprehensive medical, dental, and vision coverage.
- 401(k) with company match.
- Parental leave and generous PTO.
- Life and disability insurance.
- Learning and development benefits.