Related skills
machine learning, RAG, metrics, evaluation, LLM
📋 Description
- Design, prepare, and curate high-quality evaluation datasets with defensible methodology.
- Define criteria for dataset construction with statistical rigor and reproducibility.
- Develop new metrics and evaluation frameworks to measure model performance.
- Evaluate LLMs and pre-trained models using carefully chosen datasets and metrics.
- Build scalable pipelines for training, fine-tuning, and benchmarking models.
- Contribute to projects involving fine-tuning, retrieval-augmented generation (RAG), and adaptation methods.
🎯 Requirements
- Education: Master’s degree (PhD preferred) in CS, Statistics, ML, or a related field.
- Strong math and stats foundations for ML (probability, linear algebra, optimization).
- End-to-end ML lifecycle: dataset prep, training, evaluation, deployment, monitoring.
- Expertise in evaluation dataset design and metrics creation.
- Experience with LLM evaluation, fine-tuning, RAG, and production pipelines.
- Track record of impact at staff/principal level and setting ML standards.