Role overview
Speechify is looking for a Senior Software Engineer, AI Model Serving, based in Lisbon, Portugal, to design, build, and scale our AI model serving platforms. You will work closely with ML researchers and software engineers to deploy reliable, low-latency inference services that power Speechify's AI features.
Responsibilities
- Build and maintain scalable AI model serving infrastructure (inference APIs, model packaging, versioning, monitoring); see the sketch after this list.
- Collaborate with ML researchers to productionize models for real-time and batch inference.
- Design and implement robust deployment pipelines using Kubernetes, Docker, and cloud services (AWS/Azure/GCP).
- Ensure reliability, observability, and performance of inference systems with monitoring, tracing, and alerting.
- Mentor and guide other engineers, fostering best practices in software engineering and ML tooling.
- Write clean code, documentation, and incident postmortems.
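
To give a flavor of the first responsibility, here is a minimal sketch of an inference API built with FastAPI and ONNX Runtime. It is illustrative only: the model path, input shape, and endpoint name are assumptions, not Speechify's actual service.

```python
# Minimal sketch of a low-latency inference API. Assumes an ONNX model at
# ./model.onnx with a single float32 input; paths, names, and shapes are
# illustrative placeholders, not a real Speechify service.
import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the model once at startup so every request reuses the same session.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name


class PredictRequest(BaseModel):
    features: list[float]


@app.post("/predict")
def predict(req: PredictRequest):
    # Batch dimension of 1; a production service would batch and validate shapes.
    x = np.asarray([req.features], dtype=np.float32)
    outputs = session.run(None, {input_name: x})
    return {"prediction": outputs[0].tolist()}
```

A service like this would typically run under uvicorn (e.g. `uvicorn app:app`), be packaged with Docker and deployed on Kubernetes as described above, and gain batching, input validation, and metrics before it is production-ready.
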
Requirements
- 5+ years of software engineering experience and a track record of building production-grade systems.
- Strong Python development skills; experience with ML model serving frameworks (TorchServe, TensorFlow Serving, or ONNX Runtime); a packaging sketch follows this list.
- Experience deploying and scaling ML models in production, including containerization (Docker) and orchestration (Kubernetes).
- Familiarity with REST/gRPC APIs, microservices architecture, and API design.
- Experience with cloud platforms (AWS, GCP, or Azure) and CI/CD pipelines.
- Strong problem-solving and collaboration skills, and the ability to communicate effectively with cross-functional teams.
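
As a concrete example of the packaging work behind the Python/serving requirement, here is a hedged sketch of exporting a toy PyTorch module to ONNX so it can be loaded by ONNX Runtime. The module, shapes, and file name are placeholders; TorchServe and TensorFlow Serving use their own packaging formats instead.

```python
# Hypothetical packaging step: export a trained PyTorch module to ONNX so it
# can be served by ONNX Runtime. The model and shapes here are placeholders.
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 16, out_features: int = 4):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = TinyClassifier().eval()
example_input = torch.randn(1, 16)

# dynamic_axes lets the serving layer send variable batch sizes.
torch.onnx.export(
    model,
    example_input,
    "model.onnx",
    input_names=["input"],
    output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)
```

The exported `model.onnx` is what an inference session (as in the API sketch above) would load; a CI/CD pipeline will often run an export-and-smoke-test step like this for every model version it releases.
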
Nice to have
- Experience with large language models and prompt-tuning workflows.
- Rust, Go, or C++ experience; performance optimization for low-latency inference (see the latency-measurement sketch after this list).
- Experience with ML monitoring and model governance tools.
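
For the low-latency optimization item above, a common starting point is measuring tail latency against a running endpoint. The sketch below assumes a local `/predict` endpoint like the one shown earlier in this posting; the URL, payload, and request count are arbitrary.

```python
# Sketch of a latency check against an inference endpoint: measure per-request
# latency and report p50/p95/p99. The URL and payload are placeholders.
import statistics
import time

import requests

URL = "http://localhost:8000/predict"  # hypothetical local endpoint
payload = {"features": [0.0] * 16}

latencies_ms = []
for _ in range(200):
    start = time.perf_counter()
    requests.post(URL, json=payload, timeout=5)
    latencies_ms.append((time.perf_counter() - start) * 1000)

# statistics.quantiles with n=100 returns the 99 percentile cut points.
p50, p95, p99 = (statistics.quantiles(latencies_ms, n=100)[i] for i in (49, 94, 98))
print(f"p50={p50:.1f}ms p95={p95:.1f}ms p99={p99:.1f}ms")
```

Tail percentiles (p95/p99), rather than averages, are usually what motivate moving hot paths to Rust, Go, or C++ or tuning batching.
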
About Speechify
Speechify builds AI-powered reading assistance to help people consume content more efficiently. You will join a collaborative team focused on delivering high-quality software that scales globally.