Related skills: production, python, monitoring, evaluation, llm
📋 Description
- Define the quality bar with eval rubrics and rollout criteria.
- Build with real-world constraints: production code, monitoring, tests.
- Own features end to end from problem framing to rollout.
- Debug failures across the stack: data, infra, model, prompt logic.
- Design and implement systems: retrieval pipelines, agents, or hybrid patterns.
- Work across functions to ship features that actually stick.
🎯 Requirements
- Strong research instincts; define working evals for AI features.
- Solid Python engineering; production stacks and tests.
- LLM understanding: prompting, fine-tuning, tooling, eval.
- Systems mindset: interfaces, data contracts, failure modes, rollout plans.
- Practical bias: ship first; what survives in production is what matters.
- Ownership: initiative, clear communication, quality focus.