Senior Site Reliability Engineer


Full Time Senior-level / Expert


Critical events happen every day that threaten safety, interrupt supply chains, and disrupt operations. Rapidly pinpoint threats and automate response.


*Candidates can work remotely in BC, ON, or QC
xMatters, an Everbridge company, is a service reliability platform that helps DevOps, SRE, and operations teams rapidly deliver products at scale by automating workflows and ensuring infrastructure and applications are always working. Our code-free workflow builder, adaptive approach to incident management, and real-time performance analytics all support a single goal: delivering customer happiness. Over 2.7 million users trust xMatters daily at successful startups and global giants, including 3M, Accenture, Athena Health, Box, Credit Suisse, HSBC, NVIDIA, PepsiCo, Tesco, ViaSat, and Vodafone.

As an Operations Engineer, you will be responsible for following best practices in managing cloud infrastructure at scale. Your top priority is ensuring that production systems maintain four-nines (99.99%) uptime. With your experience, you will help advance the performance, automation, reliability, and security of our cloud-hosted infrastructure. You will also work closely with our software engineering teams to educate them and help release stable microservices.

We operate our systems with mature automation and robust engineering practices. A workday can range from hosting a workshop on the latest tech tool to building out sophisticated backend systems using Infrastructure-as-Code and industry best practices.

What you'll do:

  • Maintain and develop solutions to support CI/CD at scale.
  • Bring innovative ideas and concepts to the table.
  • Develop processes and tools for optimizing security.
  • Monitor, forecast, and create capacity plans for services.
  • Ensure backup and recovery systems are functional, tested and monitored.
  • Utilize automation for repeatable tasks.
  • Educate and support engineers on best practices.
  • Participate in a rotating 24x7 on-call group.

What you'll bring:

  • Desired languages: Python, Bash, Groovy, Go
  • Wide knowledge of DevOps best practices.
  • Strong teamwork skills, across Operations and Engineering, including working with Senior Management.
  • A passion for maintaining and writing technical documentation.
  • Desired technical experience:
  •   Google Cloud Platform / Amazon Web Services
  •   Jenkins / GitLab
  •   Kubernetes
  •   Helm or similar templating libraries
  •   Git
  •   Ansible
  •   Centralized logging systems
  •   Monitoring & alerting tools
  • A solid understanding of traditional infrastructure services:
  •   SMTP
  •   DNS
  •   Load Balancing
  •   TCP/IP Networking
  • HashiCorp product suite:
  •   Terraform
  •   Packer
  •   Consul
  •   Vault

Bridger Culture:
At Everbridge, we have a mission that matters: to keep people safe and businesses running during critical events. Our “Bridgers” join Everbridge to make a positive impact on the world through their work. The core of our company culture is built around making a difference. Our people are dedicated to solving problems during difficult times and challenging situations, as our software was built to save lives. We are a rapidly growing organization transforming the field of critical event management, and we need passionate, committed, and determined individuals to help us carry out our mission. Our environment is dynamic, and our culture is constantly evolving and expanding in order to provide the best employee experience.

Passionate about our mission? Want to #BeTheBridge? Apply to be a part of our team today!

Everbridge is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, creed, color, religion, or sex including sexual orientation and gender identity, national origin, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law.

Job perks/benefits: Team events
Job region(s): Remote/Anywhere North America