ML Infrastructure Engineer

Added: 12 days ago
Type: Full time

Related skills

Docker, Python, Kubernetes, PyTorch, DeepSpeed

📋 Description

  • Create flexible, performant ML infrastructure
  • Design ML cloud infra for massive-scale modeling and analytics
  • Support model exploration, hyperparameter optimization, pretraining and fine-tuning
  • Build scalable distributed training pipelines (sharding, cross-GPU)
  • Create, operate, and maintain ML platforms across the model lifecycle
  • Make architecture decisions balancing performance, cost, reliability, and scalability

🎯 Requirements

  • Bachelor's degree in Computer Science, Electrical Engineering, or a related field
  • 5+ years in software engineering, large-scale data infrastructure, or systems ML
  • Extensive proficiency in Python
  • Familiarity with PyTorch
  • Experience with distributed-training frameworks (FSDP, DeepSpeed, Megatron-LM, Ray)
  • Experience building or optimizing ML training pipelines for transformers or large neural-network models

🎁 Benefits

  • Collaborative, high-impact environment
  • Comprehensive benefits package
  • 401(k) plan with matching contributions