Added
6 hours ago
Location
Type
Full time
Related skills
rlhf fine-tuning dpo llama mistral
Description
- Engage with frontier lab research teams in client scoping meetings.
- Collaborate with Applied Research to publish findings from projects.
- Fine-tune open-weight models to validate data methodology.
- Reason scientifically about data strategy in real time.
- Work across multiple client engagements with rapid feedback loops.
Requirements
- MS or PhD in ML, NLP, CS, or related quantitative field.
- Hands-on fine-tuning of open-weight LLMs (Llama, Mistral, Qwen).
- Strong understanding of LLM training pipelines and data quality.
- Experience designing rigorous experiments with statistics.
- Ability to deliver experiments in days, not months.
- Strong written and verbal communication.
- Strongly preferred: frontier AI lab or applied ML startup experience.
- Strongly preferred: evaluation benchmarks and RLHF experience.
Benefits
- Location: SF or Wroclaw; hybrid with 2 days in office.
- Small, high-leverage team with direct client impact.
- Hybrid work model with fast-paced delivery and short sprints.
- Growth: career advancement tied to impact.
- Opportunity to publish research and contribute to benchmarks.
- Research time protected (25-30% of team capacity).