This job is no longer available

The job listing you are looking for has expired.

Senior Software Engineer, AI Model Serving - Kathmandu, Nepal

Added: 21 days ago
Location: Kathmandu, Nepal
Type: Full time
Salary: Not specified

Speechify is looking for a Senior Software Engineer to design, build, and operate scalable AI model serving infrastructure for production-grade NLP/ML workloads. This on-site role is based in Kathmandu, Nepal, and you’ll collaborate with ML researchers and software engineers to deploy, optimize, and maintain model serving pipelines powering Speechify’s AI features.

About Speechify

Speechify is a leading AI-powered text-to-speech platform that helps people listen to content with ease. We are focused on delivering fast, reliable, and scalable systems that bring advanced AI capabilities to users worldwide.

Responsibilities

  • Design, implement, and scale high-performance AI model serving systems for real-time and batch inference.
  • Develop APIs and microservices to expose model predictions to frontend apps and external partners.
  • Collaborate with ML scientists to optimize models for latency, throughput, and resource usage (quantization, distillation, etc.).
  • Implement CI/CD pipelines for ML workflows, containerize services with Docker, and deploy to Kubernetes clusters.
  • Monitor service reliability using metrics, traces, and logs, and establish robust alerting.
  • Optimize resource usage across cloud and on-prem environments; implement batching, caching, and autoscaling strategies.
  • Mentor junior engineers and contribute to architectural decisions across the AI platform.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • 5+ years of software engineering experience with hands-on ML model serving.
  • Proficiency in Python; experience with ML frameworks (TensorFlow, PyTorch) and model serving tools (TensorFlow Serving, TorchServe, Triton Inference Server).
  • Experience with Kubernetes, Docker, and cloud platforms (AWS, Google Cloud Platform, Azure).
  • Strong knowledge of APIs, distributed systems, and microservices architectures.
  • Experience with monitoring/logging stacks (Prometheus, Grafana, OpenTelemetry).
  • Excellent communication and collaboration skills; ability to work with cross-functional teams.

Nice-to-have

  • Experience with large language models and GPU-accelerated workloads.
  • Familiarity with CI/CD for ML, feature stores, or data pipelines.

We offer competitive compensation, comprehensive benefits, and opportunities for growth.
