Site Reliability Engineer
San Francisco / Remote
About the company
Remind, the leading communication platform in education, helps educators reach students and parents where they are: their phones. With over 30 million active users, we’re one of the fastest-growing companies in education technology, but we have our sights set on something bigger: giving every student the opportunity to succeed.
About this role
The Remind Engineering Team collaborates to deliver features for our users and customers while setting and maintaining SLAs to ensure reliable system performance. We prefer strongly typed languages over dynamically typed ones for critical business systems, and leverage both relational and non-relational data structures as needed, supporting tens of thousands of requests per second. We bias towards using the right tool for the job, including TypeScript, Python, Go, Ruby, Twirp, GraphQL, and many AWS services (Aurora, Lambda, DynamoDB, SQS, Kinesis).
As a Site Reliability Engineer at Remind, you'll collaborate with our product engineering teams and other cross-functional teams to maximize site availability, performance, and uptime. You'll also build systems and features that enable engineers to ship more quickly and more confidently.
Not in San Francisco? No problem! Our team is distributed within +/-3 hours of Pacific Time.
About you:
- You have consistently shipped high-quality code to production as part of a team
- You collaborate effectively with engineers and product managers to build systems that increase the leverage of product engineering teams and improve the security, stability, and efficiency of production systems
- You write clean code and have significant experience with one or more programming languages
- You understand the value of an appropriately defined SLA/SLO for both internal and external systems and services, and have experience building highly available systems and services that scale and perform in accordance with such an SLA/SLO
- Others enjoy working with you because of your positive attitude and technical competence
What you'll do:
- Increase the overall availability and performance of our distributed services
- Support uptime through participation in our eng-wide on-call rotation
- Help establish, conform to, and audit our SLAs/SLOs so that the performance of our website exceeds the expectations of students, parents, and educators in even our largest and most demanding school districts
- Improve our deployment process to make it as fast and predictable as possible
- Work with product engineering teams to debug production issues across services and levels of the stack
- Bring an ops perspective to engineering, and an engineering perspective to ops
- Partner with product engineering teams to ensure the security, stability, performance, and cost-efficiency of Remind’s services
- Ensure infrastructure priorities are reflected in our engineering roadmap
- Maintain open source infrastructure projects
What we offer:
- Competitive salary and equity
- 100% health coverage for you and your dependents
- Open vacation policy
- Paid parental leave
Remind is an equal opportunity employer, and we're committed to diversity and inclusion in the workplace. We aim to represent the students, teachers, and parents we serve, and we welcome, support, and empower all the diverse individuals in our community.