Hardware Engineer, GPU & PCIe

Added
24 hours ago
Type
Full time

Related skills

ansible grafana prometheus python gpu

📋 Description

  • Troubleshoot complex GPU- and PCIe-related failures
  • Partner with external vendors on failure analysis
  • Track component RMAs
  • Develop and maintain hardware/firmware management services
  • Automate server hardware lifecycle processes
  • Serve as senior hardware escalation contact

🎯 Requirements

  • 2+ years of experience supporting or troubleshooting data center GPUs (H100+; InfiniBand/NVLink)
  • Proficiency in Ansible and Python; IPMI/Redfish (Redfish preferred)
  • Experience with GPU diagnostics and observability tools (Prometheus, Grafana)
  • In-depth knowledge of server hardware, GPUs and PCIe devices
  • Experience collaborating with hardware vendors to create playbooks and drive resolution
  • Excellent documentation and attention to detail

🎁 Benefits

  • Medical, dental, and vision insurance - 100% paid by CoreWeave
  • Company-paid life insurance
  • Short- and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement