Own performance profiling and optimization for ML training using Nsight and PyTorch Profiler.
Build distributed training pipelines with PyTorch Distributed.
Develop high-performance GPU kernels in Triton or CUDA.
Optimize data loading pipelines to maximize training throughput.
Collaborate with researchers to scale large models.

🎯 Requirements

Bachelor’s, Master’s, or PhD in CS/CE or a related field.
Strong proficiency in Python.
Extensive PyTorch experience.
Experience optimizing training and inference; strong grasp of ML concepts.
Excellent analytical and problem-solving skills.

🎁 Benefits

Medical, dental, and vision coverage.
401(k) with company match.
Health Savings Accounts.
Life insurance.
Pet insurance.
Inclusive and diverse culture.
