Added: 32 minutes ago
Type: Full time
Salary: Not provided

Related skills

optimization, python, pytorch, llm, learning_rate_schedules

📋 Description

  • Own late-stage training decisions: data mix, annealing, context length.
  • Bridge pre-training and post-training; no strict research vs engineering split.
  • Design high-quality data mixtures for late-stage and annealing runs.
  • Drive coding, math, and reasoning capability via data strategies.
  • Develop synthetic data pipelines for scalable training signals.
  • Optimize multi-stage schedules and compute/data balance.
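The "multi-stage schedules" and "annealing runs" above can be illustrated with a minimal sketch of a warmup-stable-decay learning-rate schedule, where a late-stage cosine anneal closes out training. All parameter values here are illustrative assumptions, not values from the posting:

```python
import math

def wsd_lr(step, total_steps, peak_lr=3e-4, final_lr=3e-5,
           warmup_frac=0.01, anneal_frac=0.1):
    """Warmup-Stable-Decay schedule: linear warmup, a long flat
    plateau, then a late-stage cosine anneal down to final_lr.
    The annealing window is where late-stage data-mix decisions
    typically apply."""
    warmup_steps = max(int(total_steps * warmup_frac), 1)
    anneal_start = int(total_steps * (1 - anneal_frac))
    if step < warmup_steps:
        # linear warmup from ~0 to peak_lr
        return peak_lr * (step + 1) / warmup_steps
    if step < anneal_start:
        # stable plateau at peak_lr
        return peak_lr
    # cosine decay over the final anneal_frac of training
    progress = (step - anneal_start) / (total_steps - anneal_start)
    return final_lr + 0.5 * (peak_lr - final_lr) * (1 + math.cos(math.pi * progress))
```

In practice the same curve would be handed to an optimizer via a PyTorch `LambdaLR`-style wrapper; the plain function keeps the stage boundaries explicit.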

🎯 Requirements

  • End-to-end experience with the LLM training pipeline (pre-training through post-training).
  • Experience with continual pre-training, annealing, or late-stage data mixing.
  • Strong data-quality intuition; ability to filter and curate at scale.
  • Experience with synthetic data pipelines for capability improvement.
  • Proficiency in Python and PyTorch; ability to debug distributed training.
  • Strong fundamentals in optimization, statistics, ML theory.
  • Track record of original contributions: publications, OSS, etc.
  • Comfortable in ambiguous, fast-moving environments.
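The "filter and curate at scale" requirement above can be sketched as a toy heuristic document filter of the kind used in pre-training data curation: length, symbol-density, and duplicate-line checks. The thresholds are illustrative assumptions, not tuned production values:

```python
def passes_quality_filter(text, min_words=20, max_symbol_ratio=0.3,
                          max_dup_line_ratio=0.3):
    """Toy document-level quality filter: rejects very short texts,
    symbol-heavy texts, and texts dominated by repeated lines.
    Thresholds are illustrative, not tuned values."""
    words = text.split()
    if len(words) < min_words:
        return False
    # reject texts dominated by punctuation/markup noise
    symbols = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    if symbols / max(len(text), 1) > max_symbol_ratio:
        return False
    # reject texts where most non-empty lines are duplicates
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if lines and 1 - len(set(lines)) / len(lines) > max_dup_line_ratio:
        return False
    return True
```

At scale, rules like these would run as one stage in a distributed pipeline, usually followed by model-based scoring and deduplication.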

๐ŸŽ Benefits

  • Small, selective team; research and product move together.
  • Compute is not a constraint; thousands of GPUs available.
  • Environment rewards speed, autonomy, and depth.