Site Reliability Engineer
At Bloq.it, we’ve created the world’s leading smart locker solution, solving online deliveries by enabling easy participation, reducing costs, and promoting sustainability.
We’re expanding rapidly, growing at 1000% for three consecutive years, making us the fastest-growing Smart Locker company globally and one of Europe's fastest-growing scale-ups.
We are seeking a Site Reliability Engineer to join our innovative team as our #bloqstar. This role is crucial for maintaining the health, stability, and performance of our production systems. It is ideal for a highly technical engineer skilled in troubleshooting complex issues, collaborating across teams, and developing observability and monitoring solutions. As part of the 3rd level support team, you will investigate and resolve escalated issues impacting system availability, performance, and reliability.
What You’ll Be Doing
- Provide expert troubleshooting and incident management for escalated production issues, including performance degradation, outages, and anomalies.
- Diagnose and resolve complex issues across infrastructure, applications, and services, working closely with development teams to find root causes.
- Collaborate with operations, development, and security teams to enhance system reliability, scalability, and availability.
- Maintain and improve system observability tools, ensuring effective monitoring, alerting, and logging.
- Develop and update runbooks, incident response protocols, and technical documentation.
- Automate repetitive tasks to improve operational efficiency.
- Define and implement incident response processes, including root cause analysis and post-mortems.
What You’ll Bring To The Table
- At least 3 years of professional experience as a Site Reliability Engineer or in a similar role.
- Strong expertise in monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Elasticsearch, Kibana, New Relic, OpenTelemetry).
- Experience with NoSQL databases such as MongoDB or Elasticsearch is a plus.
- Deep understanding of incident management, post-mortem analysis, and on-call best practices.
- Experience with the AWS cloud platform.
- Proficiency in creating automations and tooling.
- Strong skills in Python and JavaScript/TypeScript.
- Expertise in Unix/Linux debugging using commands like grep, awk, sed, strace, tcpdump, lsof, journalctl, etc.
- A problem-solving mindset with a data-driven approach to resilience engineering.
It Would Be Great If You Also Have
- Experience setting up SRE practices from scratch.
- Experience with products combining hardware and software.
- Background in high-growth, tech-driven startups.
- Familiarity with ITIL or incident management frameworks.
- Experience implementing error budgets and reliability SLAs.
- Knowledge of Kubernetes and containerization.
Why Join Us?
- Be part of our Software team, contributing to innovative solutions that are revolutionizing the smart locker industry.
- Enjoy a dynamic, fast-paced environment fostering innovation, collaboration, and continuous learning.
- Competitive salary and benefits tailored to your skills and experience.
- Eligibility for a performance-based bonus rewarding your impact.
- Work remotely with flexible hours to maintain work-life balance.
- Portuguese Health Insurance.
- Unlimited days off (subject to manager approval).
Ready to join the revolution?
At Bloq.it, we provide end-to-end solutions for Smart Lockers. Our flagship software ecosystem, Bloq.it OS, is the leading market solution.
We’ve partnered with major clients like Vinted and DHL across Europe.
Having recently become the fastest-growing Smart Locker company worldwide, we aim to transform the industry as Tesla did with cars. We believe Smart Lockers will become as mainstream as mobile phones, and we want to lead this change.
Job Details
- Company: Phiture
- Location: Lisbon, Portugal
- Published: 18.8.2025