Lead Site Reliability Engineer

Remote

Full Time
Zapier
makes you happier :)

Posted 1 week ago

Location: Eastern Europe, Middle East, or Asia

Hi there!

We are seeking a Lead Site Reliability Engineer to join our team in a time zone between UTC +1 and UTC +5:30! As we continue to scale our product and grow our team, we’re looking for an experienced Lead SRE to manage a small team distributed across those time zones. With a focus on leadership and mentorship, you’ll still play a hands-on role in driving architecture, automation, performance, and reliability in our cloud-based infrastructure. As part of this team, you’ll be responsible for the core elements of our AWS infrastructure and orchestration.

As a leader, you’ll cultivate the talents of the team and ensure clear communication on project priorities. In this hands-on role, you’ll apply site reliability principles and a robust approach to observability. When things go wrong, you will not only fix the problem but also address the issues that contributed to it. You’ll improve application reliability by using a software engineering approach to operations. You’ll develop internal tools and systems for all engineering teams to leverage. You’ll get to impact every engineering team in the organization and use a broad set of technologies. Maintaining excellent relationships and communicating effectively with teams will be key to success.

Building new features and services is a big part of this role. We are continually developing and implementing new ways to support our teams, understanding our customers’ needs, and becoming experts in site reliability.

When bad things happen, you'll have the support of your team to solve contributing causes, learn from failures, and build a robust and resilient system for our customers. We look for the solution that automates the problem away, not the one that requires manual effort.

If you’re interested in making a big impact and taking our infrastructure to the next level at a fast-growing and profitable startup, then read on.

We know applying for and taking on a new job at any company requires a leap of faith. We want you to feel comfortable and excited to apply at Zapier. To share a bit more, here are a few resources, in addition to the job description, that give you an inside look at what life is like at Zapier. Hopefully, you'll take the leap of faith and apply.

Zapier is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.

About You

You’re an experienced technologist. You have at least 7 years of experience in the world of systems administration, systems engineering, or software development, with at least 3 years of experience in Site Reliability Engineering, DevOps, or a related field.

You’re a strong leader. You lead by example and empathy, understand the business and how to prioritize project work, emphasize mentorship, and create an environment of learning. You understand that experimentation is key and know how to support your team through that process. You encourage diversity of thought and know that the team is collectively stronger than any individual. 

You’re a great communicator. Not only do you know how to share your knowledge with the team and document things well so they can be consumed asynchronously (we do this a lot as a remote company), but you also know how to communicate effectively with software and support teams.

You know the cloud. You’ve designed and maintained highly available, cloud-based infrastructures in AWS or another cloud offering. You understand how to leverage infrastructure-as-code tools and have experience implementing best practices for reliability and observability. We use tools like Terraform, Kubernetes, Redis, GitLab, and Datadog, among others.

You can code. You have experience with a language like Python or Go to create automated tools. You believe in hands-off deployments and infrastructure as code. Well-honed experience with the fundamentals of software development goes a long way here.

You can solve complex systems challenges. You take ownership of complex challenges, understand how to improve performance, and help uncover opportunities for improvement. You’ve worked on problems where “just throw more hardware at it” isn’t enough for the system to scale.

You value our values. At Zapier, our values are at the heart of how we work together and how we think about our customers. In our remote setting, they help develop trust and ensure we collaborate to democratize automation. You see how these values can empower meaningful work, you thrive in a collaborative setting, you are eager to continue growing, and you’re excited to be part of the team.

Things We've Done Recently:

  • Migrated services from EC2 to Kubernetes, and deployed new microservices
  • Written custom Kubernetes controllers to improve resilience
  • Created deployment pipelines in ArgoCD
  • Developed autoscaling strategies to handle millions of requests daily
  • Deployed ProxySQL for pooling connections against MySQL databases

About Zapier

Zapier helps people across the world automate the boring and tedious parts of their job. We do that by helping everyone connect the web applications they already use and love.

We believe that there are jobs a computer is best at doing and that there are jobs a human is best at doing. We want to empower businesses to create processes and systems that let computers do what they are best at doing and let humans do what they are best at doing.

We believe that with the right tools, you can have a big impact with less hassle.

We believe in small teams. Small teams are fast and nimble. Small teams mean less bureaucracy and less management and more getting things done.

We believe in a safe, welcoming, and inclusive environment. All teammates at Zapier agree to a code of conduct.

The Whole Package

We're currently hiring for the following locations:

  • Most countries in time zones between UTC +1 and UTC +5:30.

Compensation:

  • Competitive salary (we don't use remote as an excuse to pay less)
  • Profit-sharing
  • 2 annual company retreats to awesome places
  • 14 weeks paid leave for new parents of biological or adopted children
  • Pick your own equipment. We'll set you up with the Apple laptop + monitor combo you want, plus any software you need.
  • Unlimited vacation policy. Plus, we require you to take at least 2 weeks off each year. We see most employees take 4-5 weeks off per year. This isn't a vague policy where unlimited vacation means no vacation.
  • Work with awesome companies around the world. We partner with great software companies all over the world, and you'll constantly get to interact with people from these companies.

How to Apply

We have a non-standard application process. To jump-start it, we ask a few questions we would normally ask at the start of an interview. This helps speed up the process and lets us get to know you a bit better right out of the gate. Please make sure to answer each question.

After you apply, you are going to hear back from us, even if we don't seem like a good fit. In fact, throughout the process, we strive to make sure you never go more than seven days without hearing from us.

Zapier is an equal opportunity employer. We're excited to work with talented and empathetic people no matter their race, color, gender, sexual orientation, religion, national origin, physical or mental disability, or age. Our code of conduct provides a beacon for the kind of company we strive to be, and we celebrate our differences because those differences are what allow us to make a product that serves a global user base.

Job tags: AWS EC2 Gitlab Go Kubernetes MySQL Python Redis Reliability engineering Terraform Web applications
Job region(s): Remote/Anywhere