Related skills
pytorch llm rlhf deepspeed cpt
Description
- Lead LLM post-training across CPT, SFT, and RL (RLHF).
- Design and curate data for each training stage (datasets, rewards).
- Collaborate with business/product teams to map use cases to training plans.
- Train at scale on mid-to-large GPU clusters with distributed training.
- Build evaluation and verifier pipelines to measure model quality.
- Stay current with post-training research and ship production-ready code.
Requirements
- Hands-on LLM post-training experience with RL (RLHF/PPO/DPO).
- Strong ML data engineering; design data-prep plans for business needs.
- Proven experience with large-scale distributed training on mid-to-large GPU clusters.
- Strong PyTorch fundamentals; TRL/Accelerate/DeepSpeed/FSDP; vLLM.
- Solid knowledge of tokenization, attention, and alignment/failure modes.
- Bias toward fast iteration and clear cross-team communication.
Benefits
- Health, dental, and vision coverage for you and your family (100% covered for the employee).
- 401(k) plan with company matching.
- Paid time off and holidays.
- FSA/HSA and commuter benefits programs.
- Team activity budget.