Related skills
LLM, OCR, Pinecone, Milvus, Weaviate
Description
- Lead pipelines for unstructured data, including parsing, OCR, and normalization
- Own the indexing strategy beyond SQL, optimized for LLMs
- Design semantic chunking, embeddings, and hybrid search across Vector and Graph architectures
- Establish metadata extraction protocols for high-fidelity knowledge capture
- Define data ingestion standards and tooling for batch and streaming
- Lead and scale a team of data and platform engineers
Requirements
- 10+ years in data-heavy engineering or platform environments
- 2+ years leading platform or data engineering teams
- Experience with vector databases (Pinecone, Milvus, Weaviate)
- Experience structuring data for LLMs, including chunking, enrichment, and embedding
- Strong SQL, ETL/ELT, and data modeling
- Parsing, OCR, and unstructured data pipelines
- Cloud-native architectures (containers, IaC)
Benefits
- 100% Remote Work
- Work-from-home allowance
- Career growth with 360° feedback
- Training time for courses and events
- Mentoring program, as a mentor or a mentee
- Wellbeing hub with mental health and wellness resources
- Multicultural environment with events and celebrations