Related skills
postgresql python kubernetes fastapi opensearch
Description
- Production RAG: indexing, retrieval, hybrid search, reranking, citations
- Context Graph: entity resolution, linking, provenance; graph + vector retrieval
- LLM orchestration: tool/function calling, structured outputs, routing across model tiers
- GPU/inference cost optimization: batching, caching, quantization, autoscaling
- Safety + compliance: PII/PHI handling, redaction, audit logs, deterministic replay
- LLMOps: eval harness (golden sets, regression, adversarial)
Requirements
- 5+ years building production systems; 2+ years hands-on LLMs/RAG
- Proven RAG experience (embeddings, vector DBs, hybrid search, reranking, eval)
- Strong backend/distributed systems + observability
- Track record shipping in high-stakes environments with auditability/correctness
- Knowledge graph / entity resolution / provenance systems
- GPU inference optimization (vLLM/TGI/TensorRT-LLM, quantization AWQ/GPTQ, batching)