Site Reliability Engineer
Primary purpose of the role
We're looking for a technically proficient and proactive Site Reliability Engineer (SRE) with a solid background in incident management, cloud infrastructure, and Kubernetes. You'll play a key role in ensuring operational excellence by aligning technology with business needs and building scalable, cost-effective, and low-maintenance systems.
This role involves hands-on incident resolution in production, deployment and optimization of Kubernetes-based environments, infrastructure support, and the development of custom monitoring and reporting tools. You'll also coordinate infrastructure activities across multiple locations, ensuring minimal disruption and high service availability.
Key responsibilities
- Incident Management and live troubleshooting
- Deploy and manage Kubernetes environments
- Perform Root Cause Analysis and create RCA documentation
- Infrastructure support, reporting and performance monitoring
- Coordination of weekend activities (migrations, audits, upgrades)
- Develop and maintain operational dashboards
Key relationships and knowledge
- Solid experience with Kubernetes
- Strong understanding of Linux OS
- Experience with cloud platforms (preferably AWS)
- Proficient in Python or Java
- Familiarity with networking protocols (DNS, DHCP, VLANs, routing, switching)
- Exposure to enterprise tools such as Azure, Active Directory, Microsoft 365, Slack, Outlook/Exchange, and AWS Workspaces
- Experience in deployment and infrastructure automation
- Strong problem-solving skills and autonomy
- Passion for technology, ideally demonstrated through personal or open-source projects
Remote role with occasional travel to Lisbon (once per month).
Excited? So are we. Send us your CV and let's build the future together.
Job details
- Company: Nearshore Portugal
- Location: Lisbon, Portugal
- Published: 14.10.2025