Added
less than a minute ago
Location
Type
Full time
Related skills
react postgresql python typescript langgraph
Description
- Own enterprise-scale evaluation infrastructure for AI agents.
- Build a unified evaluation platform that serves as the single source of truth for workflows.
- Develop observability to surface agent behavior and failures.
- Integrate LLMs, tools, retrieval, and logic into reliable agent experiences.
- Create automated pipelines to evaluate models within hours of release.
Requirements
- Multiple years shipping production software in complex systems.
- TypeScript, React, Python, and PostgreSQL.
- Built and deployed LLM-powered features in production.
- Designed evaluation frameworks for model outputs and agent behaviors.
- Worked with vector databases, embeddings, and RAG architectures.
- Experience with evaluation platforms (LangSmith, Langfuse, or similar).
Benefits
- Competitive compensation with meaningful ownership.
- Flexible PTO.
- 401k.
- Wellness benefits, including therapy sessions.
- Technology & Work from Home reimbursement.
- Flexible work schedules.