Related skills
distributed systems, prompt engineering, llm, rag pipelines, agent orchestration
📋 Description
- Design and build distributed, fault-tolerant infrastructure for AI agent capabilities (memory/state).
- Own architecture decisions for agent infra services end-to-end, from design to production.
- Build evaluation, observability, and reliability tooling for agents.
- Collaborate with product engineers and tax data teams to translate workflows into agent primitives.
- Drive engineering best practices: code quality, testing, design, and operations.
- Identify and fix performance, reliability, and scalability bottlenecks as systems grow.
🎯 Requirements
- 3+ years building and operating distributed production-grade systems.
- Designed services that are scalable, resilient, and fault-tolerant.
- Built and deployed AI agents or LLM-powered systems in production.
- Familiar with prompt engineering, eval frameworks, RAG pipelines, tool calling, and agent orchestration.
- Full software development lifecycle experience: design, review, testing, deployment, on-call.
- Able to lead technical decisions and operate in ambiguity.
🎁 Benefits
- Equity upside at an early-stage startup.
- Daily lunch and snacks in SF.
- Medical, dental, and vision insurance fully covered.
- One Medical membership and flexible sick benefits.
- Annual learning stipend for courses and conferences.
- Hybrid work with SF/NYC/SLC hubs.