Senior Research Engineer, Post-training & Evaluation

Type
Full time

Related skills

Python, PyTorch, Hugging Face Transformers, DeepSpeed, vLLM

📋 Description

  • Architect and maintain the Reddit Benchmark evaluation suite, covering safety and Reddit-specific knowledge.
  • Build scalable SFT pipelines with distributed training for instruction tuning.
  • Develop Model-as-a-Judge pipelines that use strong models to automate evaluation.
  • Execute synthetic data strategies to improve model generalization.
  • Collaborate with Safety Engineering to translate safety policies into automated tests that run in CI/CD.
  • Debug post-training instability; inspect loss curves and evaluation logs.

🎯 Requirements

  • 4+ years of professional ML engineering experience with LLM fine-tuning or evaluation.
  • Fluency in Python and PyTorch with Hugging Face Transformers, vLLM, or lm-eval-harness.
  • Deep understanding of instruction tuning (SFT) and of how data quality affects it.
  • Experience building Evaluation Pipelines and domain-specific benchmarks.
  • Familiarity with distributed training (FSDP/DeepSpeed) for fine-tuning.
  • Strong data engineering skills for curating and cleaning instruction datasets.

🎁 Benefits

  • Comprehensive healthcare benefits and income replacement.
  • 401k with employer match.
  • Global benefits for lifestyle and development.
  • Family planning support.
  • Gender-affirming care.
  • Mental health support, flexible vacation, and volunteer time off.