Type
Full time
Related skills
Python, PyTorch, Hugging Face Transformers, DeepSpeed, vLLM
Description
- Architect and maintain the Reddit Benchmark evaluation suite across Safety and Reddit knowledge.
- Build scalable SFT pipelines with distributed training for instruction tuning.
- Develop Model-as-a-Judge pipelines that automate evaluation using strong models as graders.
- Execute synthetic data strategies to improve model generalization.
- Collaborate with Safety Engineering to translate safety policies into tests in CI/CD.
- Debug post-training instability; inspect loss curves and evaluation logs.
Requirements
- 4+ years of professional ML engineering experience with LLM fine-tuning or evaluation.
- Fluency in Python and PyTorch with Hugging Face Transformers, vLLM, or lm-eval-harness.
- Deep understanding of Instruction Tuning (SFT) and data quality impact.
- Experience building Evaluation Pipelines and domain-specific benchmarks.
- Familiarity with distributed training (FSDP/DeepSpeed) for fine-tuning.
- Strong data engineering skills for curating and cleaning instruction datasets.
Benefits
- Comprehensive healthcare benefits and income replacement.
- 401k with employer match.
- Global benefits for lifestyle and development.
- Family planning support.
- Gender-affirming care.
- Mental health support, flexible vacation, and volunteer time off.