Type: Full time
Salary: Not specified


Responsibilities:

Infrastructure Development & Integration

  • Design, implement, and manage cloud-native infrastructure (AWS, Azure, GCP) to support healthcare platforms, AI agents, and clinical applications.
  • Build and maintain scalable CI/CD pipelines to enable rapid and reliable delivery of software, data pipelines, and AI/ML models.
  • Design and manage Kubernetes (K8s) clusters for container orchestration, workload scaling, and high availability, with integrated monitoring to ensure cluster health and performance.
  • Implement Kubernetes-native tools (Helm, Kustomize, ArgoCD) for deployment automation and environment management, ensuring observability through monitoring dashboards and alerts.
  • Collaborate with Staff Engineers/Architects to align infrastructure with enterprise goals for scalability, reliability, and performance, leveraging monitoring insights to inform architectural decisions.

System Optimization & Reliability

  • Implement and maintain comprehensive monitoring, logging, and alerting mechanisms (Prometheus, Grafana, ELK, Datadog, AWS CloudWatch, AWS CloudTrail) to ensure real-time visibility into system performance, resource utilization, and potential incidents.
  • Ensure data pipeline workflows (ETL/ELT, real-time streaming, batch processing) are observable, reliable, and auditable.
  • Support observability and monitoring of GenAI pipelines, embeddings, vector databases, and agentic AI workflows.
  • Proactively analyze monitoring data to identify bottlenecks, predict failures, and drive continuous improvement in system reliability.

Compliance & Security

  • Support audit trails and compliance reporting through automated DevSecOps practices.
  • Implement security controls for LLM-based applications, AI agents, and healthcare data pipelines, including prompt injection prevention, API rate limiting, and data governance.

Collaboration & Agile Practices

  • Partner closely with software engineers, data engineers, AI/ML engineers, and product managers to deliver integrated, secure, and scalable solutions.
  • Contribute to agile development processes including sprint planning, stand-ups, and retrospectives.
  • Mentor junior engineers and share best practices in cloud-native infrastructure, CI/CD, Kubernetes, and automation.

Innovation & Technical Expertise

  • Stay informed about emerging DevOps practices, cloud-native architectures, MLOps/LLMOps, and data engineering tools.
  • Prototype and evaluate new frameworks and tools to enhance infrastructure for data pipelines, GenAI, and agentic AI applications.
  • Advocate for best practices in infrastructure design, focusing on modularity, maintainability, and scalability.

Requirements

Education & Experience

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related technical discipline.
  • 9+ years of experience in DevOps, Site Reliability Engineering, or related roles, including at least 3 years building cloud-native infrastructure.
  • Proven track record of managing production-grade Kubernetes clusters and cloud infrastructure in regulated environments.
  • Experience supporting GenAI/LLM applications (e.g., OpenAI, Hugging Face, LangChain) and vector databases (e.g., Pinecone, Weaviate, FAISS).
  • Hands-on experience supporting data pipeline products using ETL/ELT frameworks (Apache Airflow, dbt, Prefect) and streaming systems (Kafka, Spark, Flink).
  • Experience deploying AI agents and orchestrating agent workflows in production environments.

Technical Proficiency

  • Expertise in Kubernetes (K8s) for orchestration, scaling, and managing containerized applications.
  • Strong proficiency in containerization (Docker) and Kubernetes ecosystem tools (Helm, ArgoCD, Istio/Linkerd for service mesh).
  • Hands-on experience with Infrastructure as Code (Terraform, CloudFormation, or Pulumi).
  • Proficiency with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD, Spinnaker).
  • Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK, Datadog, AWS CloudWatch, and AWS CloudTrail), including setting up dashboards, alerts, and custom metrics for cloud-native and AI systems.
  • Good to have: knowledge of healthcare data standards (FHIR, HL7) and secure deployment practices for AI/ML and data pipelines.

Professional Skills

  • Strong problem-solving skills with a focus on reliability, scalability, and security.
  • Excellent collaboration and communication skills across cross-functional teams.
  • Proactive, detail-oriented, and committed to technical excellence in a fast-paced healthcare environment.

About Get Well:

Now part of the SAI Group family, Get Well is redefining digital patient engagement by putting patients in control of their personalized healthcare journeys, both inside and outside the hospital. Get Well combines high-tech AI navigation with high-touch care experiences, driving patient activation, loyalty, and outcomes while reducing the cost of care. For almost 25 years, Get Well has served more than 10 million patients per year across over 1,000 hospitals and clinical partner sites, working to use longitudinal data analytics to better serve patients and clinicians. AI innovator SAI Group, led by Chairman Romesh Wadhwani, is the lead growth investor in Get Well. Get Well’s award-winning solutions were recognized again in 2024 by KLAS Research and AVIA Marketplace. Learn more at Get Well and follow us on LinkedIn and Twitter.

Get Well is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age or veteran status.
