Infrastructure Engineer

RECRUIT EXPRESS PTE LTD

Successful candidates must be comfortable working weekends and rotating shifts as and when required.

Key Responsibilities

  • Manage high-severity and high-customer-impact incidents, focusing on fast recovery
  • Champion production resilience and availability, focusing on superior client experience, by working with the operations team and technology development teams
  • Drive the implementation of Site Reliability Engineering (SRE) and Chaos Engineering design for all strategic systems
  • Drive effective communication between business and technology with regard to production service reliability and performance
  • Drive continuous improvements in processes or systems leveraging Site Reliability Engineering methods
  • Respond to, evaluate and analyze production incidents to minimize their impact, and devise innovative solutions to prevent them in the future
  • Improve the reliability and availability of systems by gathering hard data and designing systems for increased service reliability and performance
  • Provide expert advice and training to our engineers on which technology solutions and advanced reliability techniques to use in each situation
  • Any other ad-hoc duties as assigned by supervisors

Requirements

  • Bachelor's degree in Computer Science or related field
  • 10+ years of relevant experience
  • Experience driving major production incidents and organizing incident retrospective meetings
  • Experience with Core Java 8, Cloud Foundry, non-relational databases, and Linux/Unix systems
  • Experience with high availability, high-scale, and performant systems
  • Experience with Python and Unix scripting

Interested applicants, please email your resume to Douglas Tan Yu Feng

Email: ***email_hidden***

CEI Reg No: R26160004

EA Licence No: 99C4599

Recruit Express Pte Ltd