Site Reliability Engineer

San Francisco and Mountain View

Posted 3 weeks ago

Airtable is a mission-critical system for a diverse set of teams and industries. We build our platform to scale, stay resilient, and deliver a delightful user experience all around. Our infrastructure requires thorough thinking, deep research into how things work, and rigorous coding. Our mission is ambitious, but we like to keep our infrastructure simple.

As one of the first dedicated site reliability engineers at Airtable, you will play a critical role in scaling and refining our operational practices. Site reliability engineering begins with building solid automation across the software delivery process, including configuration, provisioning, testing, deployment, and beyond. SREs will also work with software engineers to help them understand the way their code behaves in production, and build nontrivial internal tooling to enable this. We also strive for a strong security posture, and SREs will help define and implement operational practices that protect our users. Lastly, of course, our operations team is the last line of defense when incidents happen, and SREs will be part of the team that responds to them.

What you'll do

  • Automate everything: deploys, rollbacks, database provisioning, failovers, and everything in between.
  • Design and implement monitoring tooling across the stack, and optimize systems for uptime, performance, and reliability based on the data gathered by this tooling.
  • Design and write tests that investigate how our infrastructure handles failure and scaling.
  • Research hot-off-the-press CVEs and implement best practices.
  • Build occasional product features as appropriate.
  • Write solid, maintainable code (including a lot of JavaScript) for all of the above.
  • Manage our Elasticsearch cluster.

Who you are

  • You're painfully thorough, whether it's scripting bulletproof deployment automation, writing a recovery playbook that an engineer can follow without fail at 3 a.m., or digging into logs and monitoring data to find the root of a problem.
  • You've worked with Linux, containers/namespaces, and system automation tools for Unix and cloud platforms.
  • You have 5+ years of relevant technical experience, including significant experience with site reliability/devops or server infrastructure engineering.
  • You're OK carrying a pager and take it seriously, but you take pride when the pager hasn't rung in the past week.

What we offer

  • Health care: we have you 100% covered (and your dependents 50% covered) with competitive medical, dental, and vision insurance. You'll also be eligible for a complimentary membership to One Medical Group.
  • Learning & Development: we offer a $2,000 per year stipend for your personal career development.
  • Gym Membership: we’re proud to provide employees in our San Francisco and New York offices with complimentary gym memberships to Equinox, or up to $100/month reimbursement towards any other gym.
  • Catered lunches: we have high-quality catered lunches every day and well-stocked kitchens. We'll also reimburse you for any reasonable food expenses incurred while working.
  • Generous PTO, sick leave, and parental leave.

About Airtable

Airtable's mission is to democratize software creation. We believe that software stands to be the single most impactful way anyone can bring their ideas to life, yet few people can actually access it as a creative medium. Airtable enables everyone to experience the power of creating, not just using, software. Headquartered in San Francisco, Airtable has raised $170M in venture funding to date, most recently a $100M Series C from Benchmark, Thrive, and Coatue.

Job tags: C Elasticsearch JavaScript Linux Reliability engineering Unix