Type
Full time
Related skills
python pytorch transformers lora triton
Description
- Own productionization of Invoca's ML stack: CI/CD for models and APIs.
- Design and optimize SLM/LLM deployment: inference on Triton, Baseten, and GPU-backed Kubernetes.
- Build robust APIs for internal and external model access.
- Collaborate with Data Scientists, Data Engineers, and Applied AI Engineers.
- Drive reliability, performance, and scalability of ML infrastructure.
Requirements
- 5+ years ML engineering with production focus
- Advanced Python and DL: PyTorch, Transformers, spaCy
- Production deployment of transformer NLP models
- Fine-tuning SLMs/LLMs: LoRA, QLoRA, PEFT; quantization
- Inference infra: Triton, Baseten, vLLM, TGI; APIs
- MLOps tooling and model monitoring
- B.S. in CS/Eng/Stats; advanced degree a plus
Benefits
- Flexible Time Off
- 16 U.S. paid holidays
- Health, dental, vision; fertility assistance
- 401(k) with company match up to 4%
- Stock options; mental health program; paid family leave
- Salary range: 152k-228k USD; plus bonus/equity