OKX

Staff AI Engineer, Model Post-Training and Alignment

Added

less than a minute ago

Location

🇺🇸 San Jose

Type

Full-time

Related skills

LLMs, RLAIF, DPO, vLLM, post-training

📋 Description

Lead and execute post-training pipelines for LLMs (supervised fine-tuning, RL).
Design advanced training paradigms such as DPO and GRPO.
Develop domain-specific data recipes, curation, and augmentation.
Train specialized small models from scratch, covering architecture, data, and optimization.
Build and refine reward models to support alignment and downstream optimization.
Improve inference efficiency with low-latency serving (vLLM, SGLang).

🎯 Requirements

Bachelor's degree in CS, AI, ML, or a related field; 8+ years of industry experience.
Strong hands-on experience with post-training pipelines for large models.
Deep familiarity with DPO, GRPO, and RL-based post-training methods.
Experience training specialized small models from scratch.
Solid understanding of RL fundamentals and alignment applications.
Experience deploying models in low-latency production (vLLM, SGLang).

🎁 Benefits

Competitive total compensation package
Learning and development (L&D) programs and education subsidy
Team building programs and company events
Wellness and meal allowances
Healthcare schemes for employees and dependants
Additional benefits disclosed during the process
