OKX

Staff AI Engineer, Model Post-Training and Alignment

Location: 🌍 Asia

Type: Full time

Salary: Not provided

Related skills: LLMs, RLAIF, DPO, vLLM, GRPO

📋 Description

Lead and execute post-training pipelines for LLMs (supervised fine-tuning, reinforcement learning).
Design training paradigms such as DPO and GRPO for alignment.
Develop domain-specific data recipes, including curation and augmentation.
Post-train specialized small models from scratch (architecture, data).
Build and refine reward models to support alignment.

🎯 Requirements

Bachelor's degree in CS, AI, or a related field.
8+ years of industry ML/AI experience.
Hands-on experience with post-training pipelines for large models.
Experience with reinforcement learning for alignment.
Familiarity with domain-specific data strategies.
Experience deploying models in production.

🎁 Benefits

Competitive total compensation.
Learning and development (L&D) programs and education subsidy.
Team-building programs and company events.
Wellness and meal allowances.
Comprehensive healthcare for employees and dependents.
