Added
less than a minute ago
Type
Full time

Related skills

Python, PyTorch, Transformers, LoRA, Triton

📋 Description

  • Own productionization of Invoca's ML stack: CI/CD for models and APIs.
  • Design and optimize SLM/LLM deployment: inference on Triton, Baseten, and Kubernetes GPU.
  • Build robust APIs for internal and external model access.
  • Collaborate with Data Scientists, Data Engineers, and Applied AI Engineers.
  • Drive reliability, performance, and scalability of ML infrastructure.

🎯 Requirements

  • 5+ years ML engineering with production focus
  • Advanced Python and DL: PyTorch, Transformers, spaCy
  • Production deployment of transformer NLP models
  • Fine-tuning SLMs/LLMs: LoRA, QLoRA, PEFT; quantization
  • Inference infra: Triton, Baseten, vLLM, TGI; APIs
  • MLOps tooling and model monitoring
  • B.S. in CS/Eng/Stats; advanced degree a plus

🎁 Benefits

  • Flexible Time Off
  • 16 U.S. paid holidays
  • Health, dental, vision; fertility assistance
  • 401(k) with company match up to 4%
  • Stock options; mental health program; paid family leave
  • Salary range: 152k-228k USD; plus bonus/equity