Location
Type
Full time
Related skills
distributed systems benchmarking profiling inference latency
📋 Description
- Build performance models translating microbenchmarks into cost-to-serve estimates.
- Analyze inference workloads end-to-end across apps, models, and the fleet.
- Enhance tooling to identify bottlenecks across layers for latency and throughput.
- Partner with other teams to turn performance insights into concrete improvements and forecasts.
🎯 Requirements
- Deep expertise in performance profiling, benchmarking, analysis, and optimization.
- Reason from first principles about distributed systems, model inference, and hardware efficiency.
- Work across abstraction layers, from application behavior to kernels, accelerators, and networking.
- Collaborate with engineering and research teams to improve production systems.