Related skills
json, etl, data pipelines, web scraping, data quality
Description
- Monitor daily health checks on ETL/Parser pipelines and raise alerts on faulty feeds.
- Investigate data anomalies by comparing system data against live data; inspect HTML/JSON.
- Develop, fix & optimize: fix extraction logic; write/edit patterns; parse embedded JSON (Jackson JQ).
- Ad-Hoc Feature Development: add new data fields; manage dev-to-prod lifecycle.
- Lead Domain Discovery: onboard new websites from scratch to release.
- Test & Release: QA workflow; validate changes against large datasets.
Requirements
- Proven experience in data quality, data operations, or web scraping.
- Hands-on extraction using CSS selectors and RegEx.
- Proficiency with JQ for parsing embedded JSON.
- Focus on performance and maintainable patterns that scale.
- Strong analytical and problem-solving skills with attention to detail.
- Ability to work through complex workflows and QA processes.
- Experience with Jira and clear technical documentation.
Benefits
- Office hubs worldwide; the team works from hub offices three times a week.
- Flexible vacation time.
- Great learning and development opportunities.
- Parental leave and benefits.
- Volunteering opportunities.
- ERGs for inclusion and belonging.