Added
1 day ago
Location
Type
Full time
Salary
Salary not provided
Related skills
python tensorflow pytorch cuda tensorrt
Description
- Develop distributed training pipelines for large datasets and models
- Build low-latency inference pipelines for real-time predictions
- Develop libraries to improve the performance of ML frameworks
- Maximize training/inference performance using GPU acceleration
- Design scalable model frameworks for high-volume trading data
- Collaborate with researchers to automate ML experiments, hyperparameter tuning, and retraining
Requirements
- 5+ years of experience in ML focusing on training or inference systems
- Experience building real-time, low-latency ML pipelines in high-performance environments
- Strong engineering skills in Python, CUDA, or C++
- Knowledge of ML frameworks such as PyTorch, TensorFlow, or JAX
- Experience with GPU programming for training/inference acceleration (cuDNN, TensorRT)
- Experience with distributed training for scaling ML workloads (Horovod, NCCL)