Staff AI Engineer, Model Post-Training and Alignment

Type
Full time

Related skills

LLMs, RLAIF, DPO, vLLM, post-training

📋 Description

  • Lead and execute post-training pipelines for LLMs (supervised fine-tuning, RL).
  • Design advanced training paradigms such as DPO and GRPO.
  • Develop domain-specific data recipes, curation, and augmentation.
  • Train specialized small models from scratch, covering architecture, data, and optimization.
  • Build and refine reward models to support alignment and downstream optimization.
  • Improve inference efficiency with low-latency serving (vLLM, SGLang).

🎯 Requirements

  • Bachelor's degree in CS, AI, ML, or a related field; 8+ years of industry experience.
  • Strong hands-on experience with post-training pipelines for large models.
  • Deep familiarity with DPO, GRPO, and RL-based post-training methods.
  • Experience training specialized small models from scratch.
  • Solid understanding of RL fundamentals and alignment applications.
  • Experience deploying models in low-latency production environments (vLLM, SGLang).

🎁 Benefits

  • Competitive total compensation package
  • Learning and development (L&D) programs and education subsidy
  • Team building programs and company events
  • Wellness and meal allowances
  • Healthcare schemes for employees and dependants
  • Additional benefits disclosed during the process