Core & MLOps Team Lead - Remote
About Us
At Zyte, we eat data for breakfast and you can eat your breakfast anywhere and work for Zyte. Founded in 2010, we are a globally distributed team of over 250 Zytans working from over 28 countries who are on a mission to enable our customers to extract the data they need to continue to innovate and grow their businesses. We believe that all businesses deserve a smooth pathway to data.
Team Lead – Core & MLOps Squad
Zyte is seeking an experienced Team Lead to manage our Core & MLOps Squad, responsible for "Building the bedrock infrastructure that powers Zyte at scale." This hands‑on technical leadership role requires expertise across MLOps, systems programming, and orchestration to lead a cross‑functional team in designing and maintaining the scalable foundation that enables all Zyte teams to build and run their services with confidence.
Technical Leadership
- Design and evolve the core platform (Kubernetes, Mesos, GPU scheduling/autoscaling, distributed compute)
- Own the model platform: registry, experiment tracking, training orchestration, evaluation, serving, and
- Build the Golden Path: reference repos, a scaffold CLI, opinionated CI/CD pipelines, runtime contracts (health/metrics/tracing/SLOs), high‑performance clients, circuit breakers and other production‑ready defaults
MLOps Excellence
- Operate a secure, multi‑tenant model registry and training platform with standardized experiment/evaluation
- Provide turnkey serving patterns (online + batch), drift/quality monitoring, and rollback
- Integrate public/open‑source AI capabilities as managed platform services with cost and data‑governance guardrails
Team Management
- Run the squad: roadmap/prioritization, delivery, mentoring, and high engineering
- Partner with product engineering (Zyte API, Scrapy Cloud), Prod Ops, and Security on adoption and rollout
- Mentor the team and foster a [...]‑thinking mindset
Ownership Areas
- Container orchestration (Kubernetes/Knative), GPU provisioning & autoscaling, environment & secret
- Operators, sidecars, and internal SDKs/libraries (Go/Rust/Python/Java) that enforce the golden path
- Model platform: registry, experiment tracking, training orchestration, evaluation framework, serving infra, model
- Observability: logging/metrics/tracing
- Billing pipeline: metering/events/cost tracking
- Golden Path: Java, Python, ML templates + CI/CD blueprints + docs + scaffold CLI
- Reliability enablement (SRE practices), cost governance, supply‑chain security (SBOM, image signing)
Qualifications
Required
- 5+ years of experience building distributed systems; 3+ years in MLOps/ML platform engineering (or equivalent impact)
- Knowledge of Linux/OS internals (process model, cgroups/namespaces), networking (TCP/IP, HTTP/2), concurrency, and performance
- Deep understanding of Kubernetes (bonus: Mesos)
- Proficiency developing high‑performance services in Java, Rust, Go or C++ (bonus: familiarity with the Vert.x and Netty frameworks); strong Python
- Experience with GPU infrastructure (scheduling, containerization, optimization)
- Track record of designing and operating model platforms (registry, training, serving, monitoring) in
- Demonstrated success leading technical teams and implementing [...]‑wide platform solutions
Preferred
- Streaming & workflows: Kafka plus Argo/Temporal/Airflow or
- eBPF‑based observability, perf tooling, or io_uring
- Cost optimization for ML/AI; multi‑tenant quotas
- Hands‑on experience authoring Golden Paths (service chassis/templates, CI/CD blueprints, CLI scaffolds)
- SRE practices (SLIs/SLOs, incident management)
Benefits
- We love fostering and nourishing new ideas and bringing them to
- Become part of a self‑motivated, progressive, multi‑cultural
- Have the freedom and flexibility to work from where you do your best work, as we are a completely remote
- Get the chance to work with cutting‑edge open‑source technologies and tools
Seniority level
Mid‑Senior level
Employment type
Contract
Job function
Other
Industries
IT Services and IT Consulting
- Detailed information about the job offer
Company: 1GLOBAL
Location: Viseu, Viseu District, Portugal
Published: 4.1.2026