AI/ML Engineer - Web Data Quality - Remote
About Us
At Zyte, we eat data for breakfast and you can work from anywhere. Founded in 2010, we are a globally distributed team of over 250 Zytans across 28 countries, on a mission to enable our customers to extract the data they need to innovate and grow. We believe all businesses deserve a smooth pathway to data and lead the way in building powerful, easy‑to‑use tools to collect, format, and deliver web data quickly, dependably, and at scale.
Roles & Responsibilities
- Design and implement AI‑driven quality checks: build models to detect anomalies, identify schema drift, and classify data errors in real time.
- Automate and scale QA: replace manual and rule‑based validation with ML‑powered solutions that continuously improve.
- Leverage GenAI for validation: use embedding models, LLMs, and prompt‑driven pipelines to perform semantic checks on scraped data.
- Develop monitoring & alerting pipelines: quantify data quality via KPIs, dashboards, and automated reports for stakeholders.
- Experiment & innovate: research and prototype new AI techniques for QA, e.g., using embeddings, synthetic data, and reinforcement learning to stress‑test scrapers.
- Collaborate cross‑functionally: work with developers, product managers, and account teams to integrate AI‑based QA into production workflows.
- Communicate insights: present findings with clear visualizations, metrics, and evidence‑based recommendations to technical and non‑technical audiences.
Requirements
- Proficiency in Python & PyData stack (NumPy, pandas, scikit‑learn, PyTorch/TensorFlow preferred).
- 3+ years in a data science, applied ML, or data engineering role (ideally with exposure to QA or data validation at scale).
- Hands‑on experience with GenAI tools: LLM APIs (OpenAI, Anthropic, Google), prompt engineering, cost/token optimization.
- Strong ML fundamentals: anomaly detection, classification, clustering, embeddings, evaluation metrics.
- Experience with big data frameworks (Spark, BigQuery, or similar).
- Ability to work with very large datasets (millions+ of records).
- Version control skills (GitHub/Bitbucket).
- Excellent communication in English, both technical and non‑technical.
Desired Skills
- Prior experience in data quality automation or web data QA.
- Familiarity with LangChain, MCP, Marvin, or similar orchestration frameworks.
- Experience building QA dashboards or visualization layers.
- Background in statistics or applied mathematics.
- Previous remote/distributed work experience.
Benefits
As a new Zytan, you will:
- Become part of a self‑motivated, progressive, multi‑cultural team.
- Have the freedom and flexibility to work from where you do your best work.
- Attend conferences and meet team members from across the globe.
- Work with cutting‑edge open source technologies and tools.
Seniority level
Mid‑Senior level
Employment type
Full‑time
Job function
Quality Assurance
Industries
IT Services and IT Consulting
Job details
Company: Zyte
Location: Lisbon, Portugal
Published: 7.11.2025