Infrastructure Engineer, Batch Data Platform
Remote, US/Canada
Stripe’s infrastructure powers businesses all over the world. We process payments, run marketplaces, detect fraud, help entrepreneurs start an internet business from anywhere in the world, build world-class developer-friendly APIs, and more. If you’re an infrastructure engineer here, you’ll get to build the systems that power our products.
Stripe doesn’t process quite as many requests as Twitter or Facebook, but we care a great deal about reliability. Every request we process matters to everyone involved, and we can’t go down because our users’ businesses depend on us.
You’ll be on a team that maintains a product we provide to the rest of engineering, like storage, search or message queueing. You’ll make decisions with a significant impact on Stripe. There is a lot of work to do to make Stripe engineers’ work easier and our platform even more reliable than it is today, and we’d love for you to be part of it. We’re close to the people using our systems, so we constantly get feedback that we can use to make them better.
We have a few dozen infrastructure engineers today spread across several different teams, and you’ll work with other infrastructure engineers as well as product engineers who use the systems you’re building.
We’re looking for people with a strong background (or interest!) in systems. We’d love to hear from you whether you’re a seasoned systems developer or have just learned you might like working with databases. Many of our infrastructure engineers work remotely, and we’d be happy to talk to you about the possibility of working remotely.
You will:
- Design, build, and maintain the core infrastructure used by all of Stripe’s engineering teams
- Debug production issues across services and levels of the stack
- Plan for the growth of Stripe’s infrastructure
- Build a great customer experience for people using your infrastructure
To get a concrete idea of what projects you might work on here, see the “Projects you could work on” section.
We’d love to hear from you if you:
- Think about systems: their edge cases, failure modes, and lifecycles
- Know your way around a Unix shell
- Can debug complex problems across the whole stack
- Focus on the needs of our users, both internal and external
- Hold yourself and others to a high bar when working with production
- Take a metrics-driven approach and can make informed decisions using data
- Are able to write high-quality code in a programming language (e.g. Ruby, Scala, Go)
It’s not expected that any single candidate would have expertise across all of these areas. For instance, we have wonderful team members who are really focused on their customers’ needs and building amazing user experiences, but didn’t come in with as much systems knowledge.
Projects you could work on:
We have a ton of important work to do, which is why we’re hiring! Our projects are of course changing all the time, but here are a few, including some we’ve done in the past, so you can get an idea of the types of work we do. Technologies we use include: HAProxy, nginx, Consul, Jenkins, Datadog, Elasticsearch, StatsD, Kafka, RabbitMQ, Storm, and others.
- Plan and implement multi-region availability for our distributed job queuing infrastructure! All of our systems can sustain losing machines, and making our systems even more resistant to failure is a big theme for us. If you like thinking about distributed systems, you might find a good home here
- Write easy-to-use and reliable client libraries for our Kafka or database systems. You’ll write abstractions and provide reasonable defaults around timeouts and error handling for a complex system
- Move us to a new region with no downtime. Last year, we needed to migrate between AWS regions, and we pulled it off with no negative effects on our users and no downtime
- Request tracing! Your mission: make it easier for any Stripe engineer, when debugging, to trace a request from its source down to every service it touched
- Build fantastic code review tools! If you love helping developers be more effective at their jobs, we have a ton of interesting projects in this area. Related projects: you could help us make our builds more reproducible with Bazel and build great developer environments
- We have a bunch of projects around deploying and running code: help us instantly roll back bad deploys so that we can recover quickly, and build infrastructure that lets us scale up our API workers in seconds in response to high API load
- We need to scale our databases to handle 10x the load they can today. You could help us shard them more effectively, upgrade our database engines, and build great tools for developers so they can understand their slow queries more easily. A lot of our database projects are open source
- Build a seamless zero-downtime process to upgrade Elasticsearch clusters. Our write-heavy workloads combined with our users’ need for reliability make this a unique challenge
Stripe is helping the internet fulfill its potential as a platform for economic progress by building software tools that accelerate global economic access and technological development. Stripe makes it easy to start, run and scale an internet business from anywhere in the world.
Stripe is, at its heart, an engineering company. To provide a missing pillar of core internet infrastructure, we hire people with a broad set of technical skills (and from a wide variety of backgrounds) who are ready to take on some of the most challenging problems in the industry – from reliably handling 100M API requests per day, to building adaptive machine learning as a result of years of data science and infrastructure work, to enabling entrepreneurs worldwide to start a global internet business.
We look at Stripe as a constant work in progress and the same is true of our people; for all of us, we believe the best is yet to come. We’re here to support each other in our curiosity and creativity – which we pursue through thoughtful discussion and knowledge-sharing among a diverse set of peers and colleagues.
We encourage all engineers to transition teams once every year and a half and also take on short-term projects with other teams across Stripe. This enables engineers to learn how different parts of Stripe work while also establishing stronger ties and cross-pollination between groups.
We contribute to existing open-source projects and support the people working on them, and we release several of our own tools as open source.
We want to work in a company of warm, inclusive people who treat their colleagues exceptionally well: the kind of people who go out of their way to help other Stripes in the short term and push them to improve over the long term (by helping them get better at what they do).
We’re a highly cross-functional organization and view that as part of the fun: we design our space to encourage as much collaboration as possible. We have long tables in the kitchen for a reason (to enable everyone to meet new people and learn from them). We also have a culture of transparency that we carry through to email communication, ensuring that Stripes all around the world have the information they need to make good local decisions.
In both our products and our people, we aim to reflect, represent and advocate for all of our users, globally. Our users transcend geography, culture and language; what we share, collectively, is a drive to create a fairer, more economically interconnected world.