Related skills
optimization, python, pytorch, llm, learning_rate_schedules

📝 Description
- Own late-stage training decisions: data mix, annealing, context length.
- Bridge pre-training and post-training; no strict research vs engineering split.
- Design high-quality data mixtures for late-stage and annealing runs.
- Drive coding, math, and reasoning capability via data strategies.
- Develop synthetic data pipelines for scalable training signals.
- Optimize multi-stage schedules and compute/data balance.
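For illustration only (not part of the posting): the "annealing" and "multi-stage schedules" above typically refer to decaying the learning rate late in training. Below is a minimal sketch of a warmup-plus-cosine-anneal schedule in plain Python; the function name and all parameter values are hypothetical, not taken from this role.

```python
import math

def lr_at_step(step, total_steps, peak_lr=3e-4, final_lr=3e-5, warmup_steps=100):
    """Hypothetical warmup + cosine-annealing learning-rate schedule."""
    if step < warmup_steps:
        # Linear warmup from ~0 up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to final_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return final_lr + 0.5 * (peak_lr - final_lr) * (1 + math.cos(math.pi * progress))
```

In a real run this function would feed an optimizer (e.g. via a PyTorch `LambdaLR`-style scheduler); the shape, peak, and floor of the schedule are exactly the late-stage decisions this role owns.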
🎯 Requirements
- End-to-end experience with the LLM training pipeline (pre-training through post-training).
- Experience with continual pre-training, annealing, or late-stage data mixing.
- Strong data-quality intuition; ability to filter and curate at scale.
- Experience with synthetic data pipelines for capability improvement.
- Proficiency in Python and PyTorch; able to debug distributed training runs.
- Strong fundamentals in optimization, statistics, ML theory.
- Track record of original contributions: publications, OSS, etc.
- Comfortable in ambiguous, fast-moving environments.
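Again purely illustrative: "late-stage data mixing" usually means sampling training examples from several corpora according to tuned weights. A minimal sketch, with made-up source names and mixture weights, might look like this:

```python
import random

def sample_mixture(sources, weights, n, seed=0):
    """Draw n training examples according to mixture weights (illustrative only).

    sources: dict mapping source name -> list of examples
    weights: dict mapping source name -> sampling weight (need not sum to 1)
    """
    rng = random.Random(seed)
    names = list(sources)
    probs = [weights[name] for name in names]
    # First pick which corpus each slot comes from, then pick an example from it.
    picks = rng.choices(names, weights=probs, k=n)
    return [rng.choice(sources[name]) for name in picks]

# Hypothetical late-stage mix: upweight math and code relative to web text.
corpora = {"web": ["w1", "w2"], "code": ["c1", "c2"], "math": ["m1", "m2"]}
mix = {"web": 0.2, "code": 0.4, "math": 0.4}
batch = sample_mixture(corpora, mix, n=8)
```

In practice the "filter and curate at scale" work is in choosing what goes into each corpus and tuning `mix` against downstream evals, not in the sampler itself.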
🎁 Benefits
- Small, selective team; research and product move together.
- Compute is not a constraint; thousands of GPUs available.
- Environment rewards speed, autonomy, and depth.