Related skills
docker, prometheus, python, kubernetes, nvidia, triton

Description
- Deploy and optimize model-serving platforms (Triton/vLLM/KServe) in Rackspace Private Cloud/Hybrid.
- Bridge AI engineering and platform ops for secure, scalable inference services.
- Deploy models in Kubernetes; tune latency and throughput to meet SLAs.
- Integrate models with Rackspace Unified Inference API and API Gateway for multi-tenant routing.
- Enable observability and FinOps reporting for GPU utilization and cost.
- Support customer onboarding for BFSI, Healthcare, and other verticals.
Requirements
- Hands-on with NVIDIA Triton, vLLM, or similar serving stacks.
- Strong Kubernetes, GPU scheduling, and CUDA/MIG knowledge.
- Familiarity with VMware VCF9, NSX-T, and vSAN.
- Proficiency in Python and Docker.
- Experience with observability stacks (Prometheus, Grafana) and FinOps practices.
- Customer-facing communication skills.
Preferred Certifications
- NVIDIA Certified Professional (AI/ML) preferred.
- Kubernetes Administrator (CKA) preferred.
- VMware VCF Specialist preferred.
- Rackspace AI Foundations (internal) preferred.