This job is no longer available

The job listing you are looking for has expired.
Please browse our latest remote jobs.

AI Engineer & Researcher, Inference - San Francisco, USA

Added: 22 days ago
Type: Full time
Salary: Not Specified

Job Overview

Speechify is seeking an AI Engineer & Researcher, Inference, to join our San Francisco, USA team. This role blends research and engineering to build and optimize high-performance inference systems for speech models. You will collaborate with the ML research and product teams to advance the state of the art in model latency and memory efficiency and to deploy models reliably to production. The ideal candidate is proficient in modern ML frameworks and experienced in turning research insights into production-ready inference tooling.

Responsibilities

  • Design, implement, and optimize scalable inference pipelines for speech models.
  • Conduct applied research on efficient inference techniques (e.g., quantization, pruning, distillation) and integrate them into production systems.
  • Evaluate models for latency, accuracy, and memory usage; develop benchmarks and monitor performance in production.
  • Collaborate with ML researchers and product teams to deploy reliable, real-time speech solutions.
  • Build tooling to automate experiments, track results, and support model lifecycle management.

Requirements

  • Strong background in machine learning and systems, with a focus on inference for deep models.
  • Proficiency in Python and at least one ML framework (PyTorch or TensorFlow).
  • Experience with C++/CUDA and performance-oriented programming is a plus.
  • Familiarity with distributed systems, GPUs, and cloud-based deployments.
  • Excellent collaboration skills and ability to translate research ideas into production-ready solutions.

Nice to have

  • Experience in speech recognition, speech synthesis, or related audio processing domains.
  • Publications or open-source contributions in ML research topics.
  • Experience with quantization, pruning, distillation, or other model compression techniques.
