Mission
We’re reimagining instant shopping so that technology never stands in the way; instead, it accelerates you toward your goal by forming a deep connection with your needs and desires. In the Personal Superintelligence Lab, you will lead the design and deployment of agentic AI that reasons over rich, real‑world context and constraints, grounded in up‑to‑the‑minute knowledge and leveraging our unparalleled delivery speed. Your work will push the state of the art in alignment, grounding, and multi‑agent orchestration, while landing breakthroughs safely and at scale in production.
Scope and Impact
As an AI Engineer III, you will be the technical lead across context engineering, RLHF/RLVR and low‑latency serving. You’ll define the architecture, standards, and evaluation strategy that connect research to real‑world lift. You’ll mentor colleagues, influence cross‑functional roadmaps, and ship systems that deliver measurable improvements to core customer and business outcomes—without disclosing competitive intelligence.
Areas of Leadership and Contribution
Advanced Context & Grounding Research:
- Set the strategy for context engineering to maximize precision/recall of key order metrics across sessions, households, locales, and time.
- Architect multi‑modal context integration (temporal, spatial, behavioral) and real‑time grounding with dynamic constraint satisfaction.
- Establish retrieval freshness, geo/time‑aware constraints, and memory policies; formalize context schemas and data contracts.
- Champion declarative prompt/program compilation (e.g., DSPy) for systematic, testable LLM behavior.
- Design multi‑agent orchestration patterns (e.g., graph‑based agents via LangChain/LangGraph, CrewAI, AutoGen, LlamaIndex) that yield robust emergent reasoning.
Alignment and learning systems:
- Lead supervised reasoning-centered fine‑tuning with rigorous data curation, synthetic data generation, and QA; institute golden sets and rubric/pairwise evals.
- Own the reasoning architecture and evaluation strategy—planning, tool selection, reflection, and uncertainty-aware decision-making—to deliver robust, low-latency, grounded outcomes at scale.
- Drive parameter‑efficient adaptation strategies (LoRA/QLoRA and text-to-LoRA) with clear criteria for when to specialize vs. generalize.
- Architect RLHF and RLVR pipelines; build preference data loops, scalable oversight, and guardrails.
- Own policy optimization strategy: expert use of DPO/PPO/GRPO/GSPO and advancement beyond them (constrained optimization, regularized objectives, KL‑control) with formal safety considerations.
- Ensure robust offline‑to‑online correlation via counterfactual/IPS/DR estimators and stress tests across traffic segments.
Safety, robustness, and privacy:
- Establish interpretability, controllability, and alignment verification practices for agentic systems.
- Develop safeguards against reward hacking and unsafe exploration; enforce distributional robustness and content policy compliance.
- Advance privacy‑preserving methods (data minimization, federated/on‑device learning where appropriate) with privacy‑by‑design.
Systems, serving, and evaluation at scale:
- Architect low‑latency, cost‑efficient inference (quantization, caching, batching, streaming) with resilient fallbacks and red‑teaming.
- Build eval frameworks that tightly couple offline metrics with online performance and safety criteria; define promotion gates.
- Use relevant APIs to perform high‑fidelity data augmentation that strengthens grounding, disambiguation, and availability‑aware suggestions.
Experimentation and cross‑functional impact:
- Partner closely with Engineering and Data Science to design experiments, define success criteria, and iterate quickly from signal to lift.
- Translate ambiguous product goals into crisp technical milestones; maintain clear documentation, incident response, and learning playbooks.
- Mentor colleagues; raise the bar on design quality, reproducibility, and ethical rigor.
Requirements:
The only predictable thing about life is that it’s wildly unpredictable. That’s where we come in.
When life does what it does best, customers turn to Gopuff to deliver their everyday essentials and help them get through their day and night, workday and weekend.
We’re assembling a team of thinkers, dreamers & risk-takers...the kind of people who know the value of peace of mind in an unpredictable world. (And people who love snacks.)
Like what you’re hearing? Welcome to Gopuff.
The Gopuff Fam is committed to an inclusive workplace where we do not discriminate on the basis of race, sex, gender, national origin, religion, sexual orientation, gender identity, marital or familial status, age, ancestry, disability, genetic information, or any other characteristic protected by applicable laws. We believe in diversity and encourage any qualified individual to apply. We are an equal employment opportunity employer.