Added
13 days ago
Type
Full time

Related skills

speech pytorch audio automatic_speech_recognition neural_audio_codecs

📋 Description

  • Research, develop, and optimize voice/audio neural models (TTS/ASR).
  • Build production training/inference pipelines for voice models, focusing on latency.
  • Run end-to-end experiments: data, design, training, evaluation, ablations.
  • Collaborate with ML, product, and infra to deploy voice models in Pi.
  • Explore neural audio codecs, diffusion synthesis, streaming, multimodal models.
  • Develop evaluation frameworks with perceptual metrics and benchmarks.
  • Contribute to Inflection's research culture via publications and reviews.

🎯 Requirements

  • 2-5 years in audio, speech, or multimodal ML (research or engineering).
  • Strong PyTorch proficiency; experience training/debugging large-scale models on GPUs.
  • Solid understanding of audio/speech: spectrograms, mel, vocoders.
  • Able to take ideas from prototype to production, including CUDA-aware training loops.
  • Familiar with diffusion, autoregressive codecs, flow-matching for audio.
  • Clear, collaborative communication for cross-functional teams.
  • BS/BA in CS/EE/Linguistics or related; MS/PhD preferred.

🎁 Benefits

  • Medical, dental, and vision coverage.
  • 401k matching.
  • Unlimited PTO.
  • Parental leave and caregiver flexibility.
  • Visa support for international Bay Area employees.

🛃 Visa sponsorship

