Senior Site Reliability Engineer
Mailchimp is a leading marketing platform for small business. We empower millions of customers around the world to build their brands and grow their companies with a suite of marketing automation, multichannel campaign, CRM, and analytics tools.
We are seeking an engineer to help us build and support the services that send over a billion emails per day and that millions of small businesses rely on. You’ll develop tooling, monitoring and processes to make our products highly available, performant, and reliable. You can expect to build dashboards, configure monitoring and alerting, write code for internal applications, and support incident resolution. You will work closely with members of our Engineering teams as an advocate for production-ready software. You will participate in an on-call rotation and lead efforts to resolve issues that other on-call engineers encounter during their rotations.
Our ideal candidate is comfortable picking up new technologies and frequently switching technical contexts. You value pragmatism over novelty in tooling and language selection. You have a deep understanding of how the different layers and components of a web application fit together. You’re passionate about reducing toil, automating wherever possible, and constantly working to improve the stability of a Mailchimp user’s experience. You've broken production environments, then buckled down, fixed them, and maybe even held a blameless post-mortem about it!
Many of our teams have at least one distributed member, so we encourage flexible working hours and some telecommuting, but you should plan to be in the office most days. You will also participate in an on-call rotation that may require availability outside of business hours.
We'd love to hear from you if:
- You have experience driving technical projects, prioritizing work, identifying dependencies, performing new feature development, and facilitating technical decisions and cross-functional team discussions.
- You have experience scripting or coding in one or more programming languages. We primarily use PHP, Python and Golang, but prior experience with these languages is not a requirement.
- You’ve developed an intuition for acceptable performance and reliability benchmarks, and have experience transforming metrics and raw data into dashboards and actionable alerts using tools like Zabbix, Grafana, Prometheus and Kibana.
- You’re familiar with the goals of configuration management and have used a tool like Puppet to automate the delivery and operation of infrastructure.
- You have experience defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to help teams make informed decisions about balancing reliability against engineering velocity.
- You’re comfortable troubleshooting production incidents and debugging across interconnected systems and at multiple layers (including network, command line, and application).
- You embrace and demonstrate our core values of humility, creativity, and independence.
Mailchimp is a founder-owned and highly profitable company headquartered in the heart of Atlanta in the historic Ponce City Market, right on the Beltline. Our purpose is to empower the underdog, and our mission is to democratize cutting-edge marketing technology for small business. We offer our employees an exceptional workplace, extremely competitive compensation, fully paid benefits (for employees and their families), and generous profit sharing. We hire humble, collaborative, and ambitious people, and give them endless opportunities to grow and succeed. If you'd like to be considered for this position, please apply below. We look forward to meeting you!
Mailchimp is an equal opportunity employer, and we value diversity at our company. We don't discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.