Own productionization of Invoca's ML stack: CI/CD for models and APIs.
Design and optimize SLM/LLM deployment: inference on Triton, Baseten, and GPU-backed Kubernetes.
Build robust APIs for internal and external model access.
Collaborate with Data Scientists, Data Engineers, and Applied AI Engineers.
Drive reliability, performance, and scalability of ML infrastructure.

🎯 Requirements

5+ years ML engineering with production focus
Advanced Python and DL: PyTorch, Transformers, spaCy
Production deployment of transformer NLP models
Fine-tuning SLMs/LLMs: LoRA, QLoRA, PEFT; quantization
Inference infra: Triton, Baseten, vLLM, TGI; APIs
MLOps tooling and model monitoring
B.S. in CS/Eng/Stats; advanced degree a plus

🎁 Benefits

Flexible Time Off
16 U.S. paid holidays
Health, dental, vision; fertility assistance
401(k) with company match up to 4%
Stock options; mental health program; paid family leave
Salary range: 152k-228k USD, plus bonus/equity

