Added: 41 minutes ago
Type: Full time

Related skills

postgresql, python, kubernetes, fastapi, opensearch

📋 Description

  • Production RAG: indexing, retrieval, hybrid search, reranking, citations
  • Context Graph: entity resolution, linking, provenance; graph + vector retrieval
  • LLM orchestration: tool/function calling, structured outputs, routing across model tiers
  • GPU/inference cost optimization: batching, caching, quantization, autoscaling
  • Safety + compliance: PII/PHI handling, redaction, audit logs, deterministic replay
  • LLMOps: eval harness (golden sets, regression, adversarial)

🎯 Requirements

  • 5+ years building production systems; 2+ years hands-on with LLMs/RAG
  • Proven RAG experience (embeddings, vector DBs, hybrid search, reranking, eval)
  • Strong backend/distributed systems + observability
  • Track record shipping in high-stakes environments with auditability/correctness
  • Knowledge graph / entity resolution / provenance systems
  • GPU inference optimization (vLLM/TGI/TensorRT-LLM, quantization AWQ/GPTQ, batching)