Related skills
gpu · vllm · tensorrt-llm · tgi

📋 Description
- Lead the Model Routing & Inference team, owning the inference platform powering AI interactions.
- Own the full inference path: latency, reliability, and cost optimization at scale.
- Set technical direction for cluster management, inference optimization, and traffic egress.
- Build a platform that lets product teams ship quickly without being exposed to provider complexity.
- Lead engineers, set direction, and balance latency, cost, reliability, and UX.
- Drive projects: inference gateway, model selection, GPU utilization, and routing control.
🎯 Requirements
- Experience leading teams that build high-throughput, low-latency distributed systems (inference, routing).
- Ability to reason about cost/performance tradeoffs at scale (GPU economics, capacity planning) with incomplete information.
- Strong software fundamentals; shipped production systems handling millions of requests.
- Experience with model serving frameworks (vLLM, TensorRT-LLM, TGI) and load balancing.
- Experience hiring and growing teams; coaching and mentorship.
- Ability to make calls balancing reliability, cost, latency, and UX in ambiguous situations.
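The cost/latency routing tradeoff the role centers on can be illustrated with a small sketch. This is not the company's system; the `Backend` fields, the latency budget, and the scoring blend are all illustrative assumptions about how a model-selection router might pick among serving backends:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    """Hypothetical serving backend with observed cost and latency."""
    name: str
    cost_per_1k_tokens: float  # dollars, assumed pricing
    p50_latency_ms: float      # assumed median latency

def route(backends, latency_budget_ms, cost_weight=0.5):
    """Pick a backend under a latency budget, blending cost and latency.

    Backends over budget are excluded; among the rest, the one with the
    lowest weighted score (normalized cost vs. normalized latency) wins.
    """
    eligible = [b for b in backends if b.p50_latency_ms <= latency_budget_ms]
    if not eligible:
        # No backend meets the budget: fall back to the fastest one.
        return min(backends, key=lambda b: b.p50_latency_ms)
    max_cost = max(b.cost_per_1k_tokens for b in eligible)
    max_lat = max(b.p50_latency_ms for b in eligible)
    def score(b):
        return (cost_weight * b.cost_per_1k_tokens / max_cost
                + (1 - cost_weight) * b.p50_latency_ms / max_lat)
    return min(eligible, key=score)

backends = [
    Backend("large-model", cost_per_1k_tokens=0.03, p50_latency_ms=900),
    Backend("small-model", cost_per_1k_tokens=0.002, p50_latency_ms=200),
]
print(route(backends, latency_budget_ms=500).name)  # small-model
```

In practice a production router would use live latency and error signals rather than static medians, but the shape of the decision (filter by SLO, then score on cost) is the same.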