Senior Software Engineer, AI Model Serving — Nagoya, Japan
Speechify is seeking a Senior Software Engineer to join our AI Model Serving team in Nagoya, Japan. You will design and implement scalable inference infrastructure that powers Speechify's AI capabilities, ensuring low latency, reliability, and high throughput in production.
Responsibilities
- Design, implement, test, and optimize AI model serving systems and APIs.
- Collaborate with ML researchers to bring models and their monitoring tools into production.
- Build scalable inference pipelines using containerized microservices and orchestration (Docker, Kubernetes).
- Ensure observability through metrics, tracing, logging, and alerting; maintain SLAs.
- Mentor junior engineers and contribute to code reviews and architecture decisions.
- Participate in onboarding, security, and compliance practices in line with company standards.
Requirements
- 5+ years of software engineering experience with a focus on ML model serving or large-scale infrastructure.
- Strong programming skills (Python preferred; other languages such as Go or Java are acceptable).
- Experience with ML serving frameworks (TensorFlow Serving, TorchServe) and/or custom serving solutions.
- Hands-on experience with containers (Docker) and orchestration (Kubernetes).
- Familiarity with cloud platforms (AWS, GCP, Azure) and CI/CD pipelines.
- Good communication skills and ability to work in a cross-functional team.
Nice to have
- Experience with Rust, C++, or low-latency systems.
- Knowledge of vector databases, model versioning, or feature stores.
Location: Nagoya, Aichi Prefecture, Japan. This role is on-site, with potential for hybrid arrangements. English and Japanese language skills are a plus.