Site Reliability Engineer (Cloud Network Automation)

TEMP-TEAM PTE LTD

Location: Singapore

Industry: Cloud Networking, Enterprise Connectivity, IoT and Smart Technology Solutions

About the Role

We are seeking a Senior Site Reliability Engineer to support and optimize our client's cloud-managed networking and connectivity platform serving SMB and enterprise customers across the region. The successful candidate will be responsible for ensuring the reliability, scalability, security, and operational excellence of cloud-based networking services, including wireless networking, switching, routing, SD-WAN, surveillance management platforms, and cloud-native applications.

Key Responsibilities

Cloud Platform Operations

  • Manage and maintain cloud-based networking platforms deployed across AWS, Azure, GCP, and private cloud environments.
  • Ensure high availability, performance, and reliability of cloud-managed networking services.
  • Monitor and optimize system health, resource utilization, latency, and customer service performance.

Site Reliability Engineering

  • Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
  • Develop proactive monitoring, alerting, and observability solutions.
  • Conduct root cause analysis and drive permanent corrective actions following incidents.
  • Participate in an on-call rotation and support critical incident management when required.

Cloud Infrastructure & Automation

  • Design and implement automation solutions using Python, Bash, or Go.
  • Improve deployment pipelines and operational processes through Infrastructure-as-Code and CI/CD methodologies.
  • Support Kubernetes-based containerized applications and cloud-native services.

Customer Platform Support

  • Work closely with enterprise and SMB customers to troubleshoot cloud platform issues.
  • Support large-scale deployments involving cloud-managed Wi-Fi, enterprise switching, SD-WAN, cloud security, video surveillance platforms, and IoT device management.

Security & Compliance

  • Ensure platform compliance with security standards and best practices.
  • Support IAM, network security, vulnerability management, and cloud security initiatives.
  • Participate in disaster recovery planning and business continuity testing.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field from a recognized university or institution.
  • Proven experience in Site Reliability Engineering, Cloud Operations, Platform Engineering, or DevOps, preferably in enterprise cloud or networking environments.
  • Hands-on experience with Kubernetes, Docker, Linux administration, at least one of AWS, Azure, or GCP, CI/CD pipelines, and infrastructure automation.
  • Networking knowledge: strong understanding of TCP/IP, DNS, routing and switching, wireless LAN, SD-WAN, VPN technologies, and network monitoring and troubleshooting.
  • Programming and automation: experience with Python, Bash, or Go (preferred); PowerShell (optional).
  • Experience supporting enterprise networking, managed services, cloud platforms, telecommunications, or related technology environments would be advantageous.

If you're enthusiastic about this role, send us your resume and a note sharing your relevant experience to ***email_hidden***.

Recruitment Manager: Shirley Chong Ai Ling (Ning)

R1325699

EA 01C3135