Senior Engineer, Service Reliability Engineering


Full Time Senior level / Expert

Posted 1 month ago

Auth0 is a unicorn that just closed a $120M Series F round of funding, bringing total capital raised to date to $330M at a valuation of nearly $2B. We are growing rapidly and looking for exceptional new team members to join our talent pool and help take us to the next level of success. One team, one score.
Our vision is to provide people with secure access to any application in one click or less. And our promise is to make identity work for everyone—whether you’re a developer looking to innovate, or a security professional looking to mitigate. We are looking for curious, excited, boundary-pushing team members. So, if you’re a big thinker who is nimble and adaptable, Auth0 may be an ideal place for you to shine.
Auth0 gives companies simple, powerful, and developer-friendly building blocks so they can free up resources to focus on innovation. We strive to be the identity platform of choice for developers and enterprises. We take our culture very seriously and are looking for people who are drawn to both our mission and our culture.
The Auth0 platform processes thousands of requests per second (3 billion logins per month) for customers all around the world, and we're growing fast! The Service Reliability Engineering team aims to improve reliability and uptime in a data-driven way to support the needs of internal and external customers.
We are looking for engineers with a good understanding of how systems fail, and a passion for helping us recover from and learn from our failures.

You are a good fit if you:

  • Have initiative and can unblock yourself to get things done.
  • Tend to deliver work incrementally to get feedback and iterate over solutions.
  • Pair with team members and other teams; collaboration is a very important part of this role.
  • Like to get your hands dirty by debugging and fixing issues in production.
  • Understand the real problems by reading between the lines and asking good questions.
  • Are easy to work with: you communicate well, take feedback in a positive way and are OK not always doing the most glamorous tasks.
  • Can work well in a fully distributed team.
  • Are comfortable taking charge during incidents.

In this role, you will:

  • Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Be on-call for services that the SRE team owns.
  • Practice sustainable incident response and blameless postmortems.
  • Take command of high-severity incidents and facilitate their resolution.

Requirements:

  • You have excellent written communication skills.
  • You are interested in designing, analyzing and troubleshooting large-scale distributed systems.
  • You have a systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • You have a great ability to debug and optimize code, and automate routine tasks.
  • You have designed applications and systems that scale, are resilient to failure, and are observable.
  • You are comfortable participating in an on-call rotation and taking charge of potentially stressful situations.
  • You have practical experience developing and improving applications written in Node.js or Go.
  • You live in a time zone between GMT-8 and GMT+2 (we are giving preference to candidates who align with the existing team).

Extra points:

  • Prior participation in Incident Command or Incident Response rotations.
  • Experience with Amazon Web Services or Microsoft Azure.
  • Experience with Linux.
  • Experience with Python or other languages besides Node.js and Go.
  • Experience with MongoDB or PostgreSQL.
  • Experience working in a remote-friendly, asynchronous environment.
  • Experience with LightStep, Datadog or other observability systems.

Preferred Locations:

  • #CA; #EU;
Auth0 safeguards more than 4.5 billion login transactions each month and its top priorities are availability and security.
We like to think that we are helping make the internet safer. Our team is spread across more than 35 countries and we are proud to continually be recognized as a great place to work. Culture is critical to us, and we are transparent about our vision and principles.
Auth0 is an Equal Employment Opportunity employer. Auth0 conducts all employment-related activities without regard to race, religion, color, national origin, age, sex, marital status, sexual orientation, disability, citizenship status, genetics, status as a Vietnam-era, special disabled, or other covered veteran, or any other characteristic protected by law. Auth0 participates in E-Verify and will confirm work authorization for candidates residing in the United States.
Job tags: Azure Go JS Linux MongoDB Node Node.js PostgreSQL Python Reliability engineering
Job region(s): Remote/Anywhere