Senior Engineer 2: Inference Optimizations

Added
11 minutes ago
Type
Full time

Related skills

cuda tensorrt rocm moe openai triton

📋 Description

  • Lead benchmarking and performance optimization for the inference engine and GPU kernels.
  • Optimize memory bandwidth and compute utilization across multi-node GPU clusters.
  • Implement cutting-edge optimization techniques (AITER tuning, kernel fusion, MoE routing).
  • Serve as the subject-matter expert (SME) on GPUs and their software stacks (CUDA, ROCm, TensorRT, OpenAI Triton).
  • Mentor engineers through high-quality code and design reviews to raise the technical bar.
  • Partner with Product and TPMs to translate hardware limits into shipped features.