Site Reliability Engineer
Description
Shield is a global startup, with offices in TLV, NYC, LDN, and LIS.
We’re rapidly growing and looking for another important piece of the puzzle.
Is it you?
We are looking for a Site Reliability Engineer to join our team!
Let’s Get Down To Business
What you'll do:
- Design and maintain scalable, reliable AWS infrastructure.
- Monitor health, performance, security, and capacity of production environments.
- Develop and manage monitoring, alerting, and logging systems for proactive issue resolution.
- Review and refine existing alerts, collaborating with developers to automate responses and enable self-healing systems.
- Develop and maintain monitoring dashboards that provide clear and actionable insights into application reliability, system performance, and capacity utilization.
- Conduct capacity planning and performance tuning to optimize system performance and resource utilization.
- Fine-tune efficiency tools (e.g., Karpenter, KEDA, HPA) based on workload patterns.
- Automate repetitive tasks and processes to streamline operations and improve efficiency.
- Manage routine operational activities, including log reviews, system checks, and verification of automated processes.
- Participate in incident response and resolution, including rapid troubleshooting, root cause analysis, and contributing to post-mortem reviews.
- Maintain and improve incident response procedures and runbooks to ensure efficient and effective handling of incidents.
- Continuously evaluate and adopt new technologies and methodologies to enhance infrastructure and operations.
- Continuously monitor resource utilization and cloud expenditure within all production environments.
- Implement operational cost optimization measures, such as rightsizing resources based on utilization data and terminating orphaned/unused resources.
- Monitor and manage capacity utilization and cloud service quotas in production environments to ensure availability and performance.
- Identify and remediate (or escalate) configuration drift from security and compliance baselines.
Experience and Skills:
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 4+ years of experience as a site reliability or platform engineer, preferably in a scaling environment.
- Hands-on experience with Terraform and Terragrunt.
- Extensive knowledge of Kubernetes and containerization technologies.
- Hands-on experience with the Prometheus stack.
- Ability to design and develop code using Python or Go.
- Strong inclination toward automating manual tasks and processes to improve operational efficiency.
- Excellent troubleshooting abilities with a methodical approach to diagnosing and resolving issues.
- In-depth knowledge of cloud services, particularly AWS, including best practices in security and compliance.
- Strong communication skills to collaborate effectively with both technical and non-technical stakeholders.
So, in case you were wondering, Shield is how compliance teams in financial services can finally read between the lines to see what their employee communications are really saying.
Our platform analyzes digital interactions to fight financial crimes and mitigate a toxic workplace environment.
Shield is a post-Series B startup ($35M) with some of the largest financial organizations in the world as investors and customers.
Shielders listen more intently. Pay closer attention to the details. Make the extra effort. Care. It’s what we do at Shield every day. And not just for our customers, but for everyone we work with. It’s all about creating a world where people understand and trust each other.
Shield is set to do good in the world: we help protect market integrity and people’s financial assets.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Job posting details
Company: Shield
Location: Lisbon, Portugal
Published: 18.6.2025