Related skills
SQL, Python, Kafka, Spark, Hive

Description
- Build scalable, fault-tolerant distributed data processing systems (batch and streaming).
- Create quality data solutions; refine datasets into simple self-service models.
- Build data pipelines that ensure data quality and resilience to dirty sources.
- Own data mapping, business logic, transformations and data quality.
- Debug low-level systems; measure performance and optimize large production clusters.
- Participate in architecture discussions; influence roadmap and own new projects.
Requirements
- Extensive SQL skills
- Python scripting proficiency
- Experience with HDFS, YARN, MapReduce, Hive, Kafka, Spark
- Data modeling across conceptual/logical/physical models
- AWS and/or GCP experience; Looker a plus
- Cross-functional collaboration with developers, analysts, and operations
- 8+ years of experience as a data engineer
- BS in Computer Science; MS preferred
Benefits
- Global mental health and financial wellness resources
- Healthcare, life, disability, and retirement options
- Vacation and personal time off policies
- Inclusive, accommodations-focused culture