Build performance models that translate microbenchmark results into cost-to-serve estimates.
Analyze inference workloads end-to-end across applications, models, and the fleet.
Enhance tooling to identify bottlenecks across layers for latency and throughput.
Partner with other teams to turn performance insights into concrete improvements and forecasts.

🎯 Requirements

Deep expertise in performance profiling, benchmarking, analysis, and optimization.
Reason from first principles about distributed systems, model inference, and hardware efficiency.
Work across abstraction layers, from application behavior to kernels, accelerators, and networking.
Collaborate with engineering and research teams to improve production systems.
