Type: Full time
Related skills
terraform, python, kubernetes, go, llamaindex
Description
- Embed with AI-native customers to manage migrations and PoCs on GPU infrastructure
- Build production-ready assets, tooling, and automation for AI workloads
- Shorten time-to-inference by scaling AI workloads in production
- Lead development of field-enabled tools that inform the product roadmap and deployment
- Validate early AI frameworks for a top developer experience
- Partner with Strategic/Technical teams to enable repeatable deployments
Requirements
- AI/ML architecture for hosting large models with inference engines (vLLM, SGLang)
- Distributed systems mastery: GPUs, CUDA/ROCm, Kubernetes, and IaC
- Hands-on with distributed inference (llm-d, Ray Serve) and GPU optimization (NVLink, RoCE)
- Production-grade Python or Go for tools and automation
- Data-driven benchmarking and GPU utilization tuning for ROI
- Consultative execution with CTOs and lead architects during migrations
- Experience with LangGraph, CrewAI, or LlamaIndex
Benefits
- Career development resources and learning opportunities
- Flexible time off and comprehensive benefits
- Equity grants and future stock purchase options
- Remote-friendly culture with global onboarding