Related skills
sql, python, airflow, spark, presto

Description
- Build scalable, fault-tolerant distributed data processing systems (batch & streaming)
- Build high-quality data solutions, refining diverse datasets into simple, self-service models
- Design data pipelines with strong data quality and resilience to bad data
- Own data mappings, transformations, and data quality
- Debug low-level systems; measure performance on large production clusters
- Participate in architecture discussions; influence roadmap and own projects
Requirements
- Extensive SQL skills
- Python scripting proficiency
- Experience with HDFS, YARN, MapReduce, Hive, Kafka, Spark, Airflow, and Presto
- Data modeling expertise for scalable architectures
- AWS and/or GCP experience; Looker a plus
- 8+ years of data engineering experience; BS in Computer Science required, MS preferred
Benefits
- Global mental health and financial wellness resources
- Healthcare, life, disability, and retirement options
- Generous vacation and personal time off