Senior Software Engineer, AI Model Serving - Kathmandu, Nepal

Speechify is looking for a Senior Software Engineer to design, build, and operate scalable AI model serving infrastructure for production-grade NLP/ML workloads. This on-site role is based in Kathmandu, Nepal, and you’ll collaborate with ML researchers and software engineers to deploy, optimize, and maintain model serving pipelines powering Speechify’s AI features.

About Speechify

Speechify is a leading AI-powered text-to-speech platform that helps people listen to content with ease. We are focused on delivering fast, reliable, and scalable systems that bring advanced AI capabilities to users worldwide.

Responsibilities

Design, implement, and scale high-performance AI model serving systems for real-time and batch inference.
Develop APIs and microservices to expose model predictions to frontend apps and external partners.
Collaborate with ML scientists to optimize models for latency, throughput, and resource usage (quantization, distillation, etc.).
Implement CI/CD pipelines for ML workflows, containerize services with Docker, and deploy to Kubernetes clusters.
Monitor service reliability, build out observability (metrics, traces, logs), and establish robust alerting.
Optimize resource usage across cloud and on-prem environments; implement batching, caching, and autoscaling strategies.
Mentor junior engineers and contribute to architectural decisions across the AI platform.

Requirements

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
5+ years of software engineering experience with hands-on ML model serving.
Proficiency in Python; experience with ML frameworks (TensorFlow, PyTorch) and model serving tools (TensorFlow Serving, TorchServe, Triton Inference Server).
Experience with Kubernetes, Docker, and cloud platforms (AWS, Google Cloud Platform, Azure).
Strong knowledge of APIs, distributed systems, and microservices architectures.
Experience with monitoring/logging stacks (Prometheus, Grafana, OpenTelemetry).
Excellent communication and collaboration skills; ability to work with cross-functional teams.

Nice-to-have

Experience with large language models and GPU-accelerated workloads.
Familiarity with CI/CD for ML, feature stores, or data pipelines.

We offer competitive compensation, comprehensive benefits, and opportunities for growth.
