Senior Software Engineer II - Applied AI and Evaluations (Remote Eligible)
Related skills
python, databricks, prompt engineering, rag, llm
Description
- Own agent quality end-to-end across SmartAssist's orchestrator and subagents
- Identify failure modes across quality dimensions and prioritize fixes
- Drive quality improvements through prompt engineering, context engineering, and RAG tuning
- Extend and mature our evaluation framework with scorers, datasets, and regression gates
- Close the feedback loop with measurable, attributable quality signals
- Collaborate with the Agent Architecture lead to flag which quality problems require prompt/context solutions versus structural fixes
Requirements
- 8+ years of software engineering experience, including at least 2 years working with LLMs in production
- Deep, hands-on experience with prompt engineering and context engineering
- Strong knowledge of RAG architectures and retrieval evaluation
- Experience building or extending LLM evaluation frameworks
- Strong Python skills; comfortable with Databricks/Delta tables
- BS or MS in CS or related field, or equivalent experience; legally eligible to work in the U.S.
Benefits
- Employer-subsidized medical/vision/dental
- 401(k) match
- Monthly stipend
- Flexible Time Away Program and Sick Time Off
- Udemy online courses for professional development
- Teleworking options from any registered location in the U.S.