Site Reliability Engineer
Lisbon, Portugal

Role: Sr SRE (Application Support + Automation)

Location: Lisbon, Portugal

Experience: 5-7 years

Who are we

Fulcrum Digital is an agile and next-generation digital accelerating company providing digital transformation and technology services, from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.

The Role

- Plan, manage, and oversee all aspects of a Production Environment

- Define strategies for application performance monitoring and optimization in the production environment

- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.

- Support deployment of code into multiple lower environments, and support current processes with an emphasis on automating everything as soon as possible.

- Design, develop and standardize monitoring and alerting mechanisms for the supported applications.

- Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recover.

- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation and refinement.

- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.

- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.

- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.

- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.

- Scale systems sustainably through mechanisms like automation and evolving systems by pushing for changes that improve reliability and velocity.

- Work with a global team spread across tech hubs in multiple geographies and time zones.

- Share knowledge and explain processes and procedures to others.

- Mentor junior team members

- Able to perform on-call duties on a rotational basis.

- Occasional off-hours work required.

- Candidates should have an inclination for training, be good trainers, and be ready to mentor others

Requirements

Must Have

  • L2 Application Support - Strong
  • Linux
  • SQL
  • Monitoring Tool - Splunk/Dynatrace or Other
  • Troubleshooting

Good to have

  • Jenkins - Basic
  • CI/CD - Basic
  • Shell scripting - Basic
  • ITIL/ITSM process
