AI Model Serving Specialist

Type: Full time
Salary: Not provided

Related skills

Docker, Prometheus, Python, Kubernetes, NVIDIA Triton

πŸ“‹ Description

  • Deploy and optimize model-serving platforms (Triton/vLLM/KServe) in Rackspace Private Cloud/Hybrid.
  • Bridge AI engineering and platform ops for secure, scalable inference services.
  • Deploy models in Kubernetes; tune latency and throughput to meet SLAs.
  • Integrate models with Rackspace Unified Inference API and API Gateway for multi-tenant routing.
  • Enable observability and FinOps practices for GPU utilization and cost reporting.
  • Support customer onboarding for BFSI, healthcare, and other verticals.
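As a flavor of the latency-tuning work above, here is a minimal, self-contained sketch of checking observed request latencies against a p95 SLA. This is illustrative only, not Rackspace code: the nearest-rank percentile method and the SLA budget are assumptions for the example.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    data = sorted(samples)
    # Clamp the rank to a valid index for small sample sets.
    k = max(0, min(len(data) - 1, round(p / 100 * len(data)) - 1))
    return data[k]

def meets_sla(latencies_ms, p95_budget_ms):
    """True if the observed p95 latency fits within the SLA budget (ms)."""
    return percentile(latencies_ms, 95) <= p95_budget_ms

# Hypothetical per-request latencies (ms) scraped from serving metrics.
latencies = [120, 135, 150, 180, 210, 240, 300, 140, 160, 170]
print(percentile(latencies, 95), meets_sla(latencies, 250.0))
```

In practice these samples would come from serving-layer metrics (e.g. Prometheus histograms) rather than a hard-coded list, but the pass/fail check against an SLA budget is the same shape.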

🎯 Requirements

  • Hands-on with NVIDIA Triton, vLLM, or similar serving stacks.
  • Strong Kubernetes, GPU scheduling, and CUDA/MIG knowledge.
  • Familiarity with VMware VCF9, NSX-T, and vSAN.
  • Proficiency in Python and Docker.
  • Experience with observability stacks (Prometheus, Grafana) and FinOps practices.
  • Customer-facing communication skills.
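The FinOps requirement above can be illustrated with a small chargeback sketch. Everything here is a hypothetical assumption for the example: the GPU-hour rate, tenant names, and the `(tenant, gpu_hours)` record format (which in a real deployment might be exported from a Prometheus recording rule on GPU utilization).

```python
from collections import defaultdict

GPU_HOUR_RATE_USD = 2.50  # assumed blended rate per GPU-hour (illustrative)

def chargeback(usage_records):
    """Aggregate per-tenant GPU-hours into a per-tenant cost report (USD)."""
    totals = defaultdict(float)
    for tenant, gpu_hours in usage_records:
        totals[tenant] += gpu_hours
    return {t: round(h * GPU_HOUR_RATE_USD, 2) for t, h in totals.items()}

# Hypothetical usage records for two tenants.
records = [("bfsi-team", 10.0), ("healthcare-team", 4.5), ("bfsi-team", 2.0)]
print(chargeback(records))
```

Multi-tenant cost attribution like this is what turns raw GPU utilization metrics into the cost reporting the role calls for.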

🎁 Preferred Certifications

  • NVIDIA Certified Professional (AI/ML) preferred.
  • Certified Kubernetes Administrator (CKA) preferred.
  • VMware VCF Specialist preferred.
  • Rackspace AI Foundations (internal) preferred.

