Related skills
aws sql python airflow spark
Description
- Design, build, and maintain ETL/ELT pipelines with Python and Spark
- Ensure data quality, reliability, and timeliness
- Model and store data in Parquet, JSON, and Avro file formats and Iceberg/Hive table formats
- Deploy pipelines on cloud infra (AWS, some GCP)
- Partner with Analysts and Data Scientists to deliver datasets
- Work with Security and Compliance to ensure proper permissions
Requirements
- 4+ years building ETL pipelines for Data Lake/Warehouse
- Python, Spark, SQL, Airflow expertise
- Data warehousing, modeling, quality validation, monitoring
- Experience with Hive, Iceberg, Glue, or similar big data table technologies
- Familiarity with Parquet, Avro, and JSON file formats
- Cloud platforms: AWS, GCP, Azure
- Exposure to real-time/streaming data is a plus
Benefits
- Employee affinity groups and wellbeing support
- Fertility assistance and parental leave policy
- Learn more about working at MongoDB