About Speechify

Speechify is an AI-powered reading platform that converts text into natural-sounding speech. We build scalable AI systems to deliver fast, personalized listening experiences.

About the Role

As an AI Engineer & Researcher, Inference, based in Austin, you will design, implement, and optimize high-performance AI inference pipelines for Speechify's models. You will prototype new techniques to improve latency and throughput, and collaborate with ML researchers and software engineers to bring models into production.

Responsibilities

- Design, implement, and optimize AI inference pipelines for speech/text models
- Prototype quantization, pruning, distillation, and other techniques to improve latency
- Benchmark models, run experiments, analyze results, and communicate findings
- Collaborate with ML and software teams to deploy, monitor, and maintain production inference systems
- Mentor teammates and contribute to code reviews and best practices

Qualifications

- Strong background in machine learning, AI research, and inference optimization
- Proficiency with Python and ML frameworks (PyTorch, TensorFlow)
- Experience with GPUs, CUDA, and scalable ML systems
- Familiarity with MLOps, experimentation tooling, and model deployment
- MS/PhD in CS/ML or a related field; 3+ years of relevant experience
- Excellent communication and collaboration skills

Nice to Have

- Experience in speech, NLP, or related domains

Benefits

- Competitive salary and equity
- Comprehensive health, dental, and vision insurance
- 401(k) with company match
- Flexible work schedule and generous paid time off

Location

Austin, USA (onsite)