Related skills
data pipeline python pytorch cuda triton
📋 Description
- Own performance profiling and optimization for ML training with Nsight and PyTorch Profiler.
- Work on distributed training pipelines with PyTorch Distributed.
- Develop high-performance GPU kernels in Triton or CUDA.
- Optimize data loading pipelines to maximize training throughput.
- Collaborate with researchers to scale large models.
🎯 Requirements
- Bachelor’s, Master’s, or PhD in CS/CE or related field.
- Strong proficiency in Python.
- Extensive PyTorch experience.
- Experience optimizing training and inference; strong grasp of ML concepts.
- Excellent analytical and problem-solving skills.
🎁 Benefits
- Medical, dental, and vision coverage.
- 401(k) with company match.
- Health Savings Accounts.
- Life insurance.
- Pet insurance.
- Inclusive and diverse culture.