An exciting Senior, Site Reliability Engineer job has just been made available at a financial services company based in Kuala Lumpur.
About the Senior, Site Reliability Engineer Role: Reporting directly to the Team Lead, you will be responsible for maintaining a healthy production environment, ensuring its availability, and improving system performance. This involves automating monitoring, pinpointing root causes, and preventing future issues. You will also analyse metrics from both operating systems and applications and work closely with development teams to enhance service quality.
Establish and maintain infrastructure and application monitoring. Create alerts and automate recovery processes for operational issues
Collect and analyse metrics from both operating systems and applications. Provide guidance for performance optimisation and fault diagnosis
Analyse data to identify errors, trends, and complex problems. Respond to escalated incidents and proactively prevent future incidents through monitoring and analysis
Develop preventive measures and automated recovery methods for potential failure scenarios
Collaborate with development teams to enhance services. Adapt to new tools and technologies such as Azure DevOps, Grafana, Dynatrace, etc. Prioritise knowledge sharing and documentation to ensure teams have access to critical information
Optimise aspects of the SDLC and incident management for improved service reliability
Offer guidance to application teams on establishing recovery processes
To succeed in this Senior, Site Reliability Engineer role, you must have more than six years' experience in an IT/DevOps/SRE role.
Bachelor’s degree in Computer Science or a related field is required. Over six years' experience in an IT/DevOps/SRE role. Proficiency in applying Agile and lean methodologies to IT operations
Highly skilled in database technologies, programming languages (.NET, C#, C++, Java 8, Python), Linux and Windows Server operating systems, and scripting languages
Understanding of open-source distributed version control systems, e.g. Git. Strong knowledge of REST API principles
Experience with Atlassian tools, familiarity with Azure Cloud services
Previous experience working with ITIL in an Agile environment
Skills in containerisation, CI/CD, dashboard development, Terraform or Ansible are an added advantage. Azure or AWS cloud certifications are also an advantage.
The scope of the offer, the size of the business, the freedom and autonomy to drive your career forward all add up to a great place to work.
If you have a successful track record in DevOps/SRE within enterprise environments, you can take your career forward with this exciting Senior, Site Reliability Engineer job.
Apply today or e-mail me at Sarah.Nunis@robertwalters.com.my to discuss this new opportunity.
Do note that we will only be in touch if your application is shortlisted.
Agensi Pekerjaan Robert Walters Sdn Bhd Business Registration Number: 729828-T Licence Number: JTKSM 423C