Research Engineer, Evaluations

Added: less than a minute ago
Type: Full time

Related skills

Cloud, SQL, Python, ML, benchmarking

📋 Description

  • Own end-to-end evaluation across accuracy, latency, and metrics.
  • Build and maintain benchmarking pipelines against competitors.
  • Design experiments to measure the impact of model changes.
  • Onboard, curate, and maintain evaluation datasets (public and internal).
  • Create evaluation subsets to stress-test capabilities and edge cases.
  • Collaborate with research and engineering teams to align with customer needs.

🎯 Requirements

  • ML fundamentals: understand model training and evaluation.
  • Strong Python skills; write evaluation scripts and data pipelines.
  • SQL and cloud infrastructure experience.
  • Metric intuition: define metrics capturing real-world performance.
  • Voice agent stack familiarity: VAD, ASR, turn detection, LLM, TTS.
  • Time zone: 3-4 hours of overlap with US Eastern Time required.

🎁 Benefits

  • Fully remote team.
  • Shape product through research.
  • Pay transparency and pay equity.
  • Collaborative, diverse team.