Overview
Speechify is seeking a Senior Software Engineer, AI Model Serving, based in Bucharest, Romania. You will design, build, and scale the model serving infrastructure that powers real-time inference and production-grade ML applications.
Responsibilities
- Design, implement, and maintain scalable AI model serving services
- Optimize inference latency and throughput for production models
- Build robust APIs and integrate with deployment pipelines
- Collaborate with ML researchers to operationalize models
- Develop monitoring, logging, security, and reliability practices
- Participate in code reviews and contribute to architectural decisions
- Mentor junior engineers and help grow the team
Qualifications
- 5+ years of software engineering experience
- Strong proficiency in Python and at least one systems language (Go, C, or C++)
- Experience with ML model serving frameworks (such as TensorFlow Serving, TorchServe, or Triton)
- Hands-on experience with Docker, Kubernetes, and at least one cloud platform (AWS, GCP, or Azure)
- Experience building APIs and distributed systems
- Bachelor's or Master's degree in Computer Science or a related field
Nice to have
- Experience with large language models and their optimization
- Familiarity with ONNX, MLIR, or related tooling
Benefits
Competitive salary and benefits, opportunity to work on cutting-edge AI applications, and professional growth in a dynamic team.