Related skills
docker python kubernetes pytorch transformers
📋 Description
- Productionize video generation models into robust inference APIs.
- Optimize models for latency and cost via distillation.
- Build real-time inference systems with low-latency streaming and observability.
- Prototype fast: ship WebRTC demos and production features.
- Multi-GPU work: run and optimize large model components across GPUs.
- Collaborate with research to deploy models and improve performance.
🎯 Requirements
- 2+ years in ML engineering, with ownership of shipped systems.
- PyTorch + Python for training and inference.
- Generative models (diffusion/transformers/VAEs) for image/video.
- Improve latency and cost via profiling and memory optimization.
- Production mindset: debugging under load, monitoring, deployment hygiene.
- WebRTC/real-time media delivery; Docker and Kubernetes basics.
🎁 Benefits
- €190,000-€225,000 base salary + bonus.
- Remote in Europe (GMT ±2 hours).