Site Reliability Manager



Vonage Engineering Mission: Vonage is the emerging leader in the $100B+ cloud communications platform (CPaaS) market. Customers like Airbnb, Viber, WhatsApp, Snapchat, and many others depend on our APIs and SDKs to connect with their customers all over the world. As businesses continue to shift to a real-time, customer-centric communications model, we are experiencing a time of impressive growth.

Why this role matters:

Vonage API is growing fast and is always working on new products. That means we continuously need to improve the collaboration between developers and operations. This is where our SRE teams come in: they provide the tools and automation for the smoothest pipeline from the developer laptop to the production servers, help the QA team deliver fast and accurate results, and explore new technologies to improve the reliability and efficiency of our services. We are looking for an SRE Manager to join our team, work on our pipelines, explore new technologies, and find reliable and scalable solutions to the problems and limitations of our current environment.


What You Will Do:

  • Drive a team of individuals to solve problems, build new services and own the shared infrastructure services that the company relies on to deliver our production environment
  • Investigate and troubleshoot hard and complex problems
  • Proactively propose improvements to the software stack, and follow them through to completion
  • Write and review code and configuration
  • Improve the visibility and monitoring of the production environment
  • Propose, evaluate and integrate new tools
  • Participate in an on-call rotation
  • Partner with the development teams to help them improve the scalability and reliability of the services they own

What You Will Bring:

The list of skills and requirements below functions as a taster of what our team gets involved with, and we do not by any means expect one person to have exposure to all of it. If the team and Vonage sound interesting to you, and you have a true passion for SRE, we invite you to apply no matter your current level and discuss opportunities with us further!

Must haves:

  • 3-5 years of management experience
  • The ability to manage a team workflow (Agile or Kanban)
  • Experience with cloud (AWS)
  • Experience with databases (RDS, Aurora, MySQL)
  • A strong background in Linux and system administration
  • Good scripting skills (Python, Bash, Groovy, Ruby, etc.)
  • Experience with configuration management (Chef) and source code management (Git)
  • Experience with infrastructure-as-code frameworks, such as Terraform and CloudFormation
  • Experience with CI/CD using tools such as Jenkins
  • Experience with container-based software toolchains, including k8s
  • Experience working with Redis, Kafka, Elasticsearch

Nice to haves:

  • Experience with foundational infrastructure services, such as DNS and LDAP
  • Knowledge of monitoring, graphing and logging tools (Nagios, Grafana, Prometheus, Kibana, Filebeat)
  • Understanding of developer tools and languages (Gradle, Java, Jenkins, Go, ...)
  • Experience with web servers and proxies (nginx, Apache, Tomcat, HAProxy)
  • Ability and willingness to work in a global, fast-paced environment
  • Flexible with the ability to adapt working style to meet objectives
  • Excellent communication and analytical skills
  • Ability to effectively communicate with team members
  • Master's or Bachelor's degree in Computer Science, or equivalent experience
Job region(s): Asia/Pacific