Added
28 days ago
Type
Full time

Related skills

Python, distributed systems, CI/CD, MLOps, LLMs

📋 Description

  • Design scalable, low-latency model-serving infra for LLMs.
  • Build APIs and services for real-time conversational workloads.
  • Optimize inference for throughput, latency, and cost.
  • Architect end-to-end ML pipelines from training to deployment.
  • Lead architecture decisions for production AI at scale.

🎯 Requirements

  • 1-4 years in ML engineering, backend systems, or distributed infra.
  • Proven experience deploying ML models in production.
  • Strong Python and/or C++ programming skills.
  • Experience with large-scale model serving (LLMs, transformers).
  • Deep understanding of distributed systems, API design, and cloud infra.
  • Experience with MLOps tools, CI/CD, monitoring, and experiment tracking.

🎁 Benefits

  • Diverse medical, dental and vision options.
  • 401(k) matching program.
  • Unlimited paid time off.
  • Parental leave and flexibility for all parents and caregivers.
  • Support for country-specific visa needs for international employees in the Bay Area.