Sr. Site Reliability Engineer
New York, NY
Beeswax is looking for a Senior Site Reliability Engineer to join our growing team. We were recently recognized as #46 on the Inc. 5000 list of Fastest Growing Companies in America, and #5 among all software companies. In 2018, we were also named by Business Insider as the “fastest growing company in AdTech.”
At Beeswax, we make great advertising software. Beeswax created the Bidder-as-a-Service™ concept because we believe that every advertiser should have power, flexibility, and transparency when buying advertising programmatically. So come join the team and help us take our customers into the next generation of Real-Time Bidding.
Beeswax offers an easy-to-use, massive-scale, high-availability advertising platform and was founded by industry veterans who worked together at Google. We’re well funded by leading VCs, such as RRE and Foundry Group, and are rapidly expanding our customer list and our engineering team. We offer our customers the most extensible and transparent advertising system in the world and process millions of transactions per second.
Our engineers come from major tech companies such as Amazon and Facebook as well as many other companies with strong software disciplines. Building and selling great advertising software that we’re proud of is the absolute heart of our mission.
At our transaction volumes, we regularly deal with scaling challenges as our tech grows and evolves. To manage the firehose of data coming in, we explore complex tradeoffs and carefully architect high-performance distributed systems. Those in turn require elegant and thoughtfully designed APIs to make the systems accessible to both our team and our customers.
As a Senior Site Reliability Engineer at Beeswax, you will be responsible for the performance, reliability, and security of revenue-critical systems serving upwards of 2 million QPS with latency thresholds of 100ms. You will design and maintain the global infrastructure supporting millions of dollars in revenue.
Our ideal candidate will have both systems and software backgrounds. We run on AWS, so experience with AWS is a major plus.
“What would I do in the first three months?”
Glad you asked! Here are some of the things we’re working on in the next few months:
- Designing our next generation of observability tooling, taking us from what we have now to something capable of tying together metrics, logs, and traces.
- Overhauling our (very large!) scale load balancers, exploring Envoy and other technologies for their second generation.
“How about the first year?”
We have no end of projects the SRE team would love to work on:
- Improve our Kubernetes platform by exploring additional technologies like service meshes, better load balancing and autoscaling, and taking advantage of Spot Fleets.
- Work with other engineers to rearchitect and rebuild some of the core services their teams rely on to be more efficient and cost-effective.
- Standardize our approaches to observability so it’s easy for a developer to do the right thing.
- Move our Terraform automation to be backed by Terraform Cloud.
These are just some of the things we plan to work on over the next year, but we hope you’ll bring your knowledge and expertise and propose projects of your own to help us continually improve.
Who You Are:
- At least 7 years of experience in the infrastructure space
- Expertise in large-scale cloud environments (at least thousands of servers)
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
- Strong experience in observability (monitoring and alerting) at large scale
- Hands-on experience with AWS
Bonus Points:
- Experience with high-performance networking
- Experience with low-level performance tuning (kernel, knowing how to use eBPF tools to understand what to tune, etc.)
Successful Engineers at Beeswax Have:
- An ethic of service and a belief in putting the customer first
- A powerful sense of pragmatism to figure out what needs to be done right versus right now
- A curiosity about technology and a desire to use it to solve problems in all sorts of domains
- An openness to feedback and more than just the spelling skills to know that there’s no I in Team
- An appreciation of repeatability, resilience, observability, and operational simplicity