Director of Site Reliability Engineering (SRE) - Remote (US based position)

Remote - Baltimore, Maryland, United States

Full Time Executive-level / Director
Olive is the only AI as a Service built specifically for healthcare. She’s always-on, improving operational efficiency through automation and delivering quick and accelerated ROI.

Olive’s AI workforce is built to fix our broken healthcare system by addressing healthcare’s most burdensome issues -- delivering hospitals and health systems increased revenue, reduced costs, and increased capacity. People feel lost in the system today and healthcare employees are essentially working in the dark due to outdated technology that creates a lack of shared knowledge and siloed data. Olive is designed to drive connections, shining a new light on the broken healthcare processes that stand between providers and patient care. She uses AI to reveal life-changing insights that make healthcare more efficient, affordable and effective. Olive’s vision is to unleash a trillion dollars of hidden potential within healthcare by connecting its disconnected systems. Olive is improving healthcare operations today, so everyone can benefit from a healthier industry tomorrow.

Our Olive Enables team is looking to add a Director to build the SRE capabilities and services/systems for our internal engineering teams. We are looking for an engineering leader to help scale our SRE teams to support our growth and build a foundation for Olive AI’s future products. As an exceptional engineering leader, you are growth-oriented and have a proven track record of building highly effective geo-distributed engineering teams that make the impossible possible. You will need to be a strategic thinker who can frame and find solutions to complex engineering and people challenges, measure and drive results, and move fast while maintaining focus on the highest-impact initiatives and keeping track of trade-offs made along the way.


  • Strong collaboration, team-first mentality, prioritization, and adaptability skills required
  • Solid understanding of cloud technologies (AWS/Azure) & architecture patterns (serverless, microservice, event-driven)
  • Collaborates with the Director of Security to ensure Olive AI embraces and applies security strategy
  • Works in a consultative fashion with other department heads, such as product, marketing, and customer success, as an advisor on technologies and product roadmap items that may improve their efficiency and effectiveness.
  • Experience providing operational KPI metrics & reports to senior management
  • Drive accountability for quality aspects in release, system performance, platform availability, operational efficiency, and risk management.
  • Assess and provide feedback on architectural designs for operational acceptance.
  • Proven leadership of SRE teams and company-wide initiatives
  • Lead 24x7 production support for the environment to quickly restore service during impacts.
  • Deep understanding of Site Reliability Engineering (SRE) philosophy, Chaos Engineering, technologies, platforms and tools, SLA management, incident resolution, and automation
  • Define and own the SRE OKRs & roadmap and conduct quarterly roadmap reviews. In addition, you should be able to champion and partner with our engineering leaders to set and maintain effective SLOs across the org.
  • Work closely with our product & engineering leadership to set standards around patterns, frameworks, technologies, and processes to promote a simple and consistent approach across multiple types of services.


  • 3+ years managing and working with a team of senior software engineers/architects
  • 5+ years' experience in: developing monitoring tools and log analysis tools to manage operations; managing and/or influencing infrastructure services to ensure application service uptime and user experience; developing and managing operations leveraging key event streaming, messaging, and DB services such as Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, BigTable, DynamoDB, MongoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
  • 5+ years of experience with a focus on strategic technology planning, and execution
  • 4+ years of hands-on experience designing and building highly distributed, cloud-based systems
  • 3+ years of experience as a Site Reliability Engineer or leading a Site Reliability Engineering team
  • Experience with a hybrid cloud environment such as Azure and AWS
  • Experience working closely with senior executives to execute effectively on strategic plans and track and report progress
  • Experience in the Healthcare IT industry
  • Proven strategic and tactical thinker, able and comfortable translating business strategy into architectures and plans, and equally comfortable rolling up your sleeves to problem-solve with your team
  • Ability to multi-task, effectively manage multiple projects and competing priorities, and adeptly re-prioritize based on changing needs
  • Must be flexible and adapt well to change

At Olive, we're committed to growing and empowering an inclusive community within our company and industry. This is why we hire and cultivate diverse teams of the best and brightest from all backgrounds, experiences, and perspectives across our organization. Research shows that oftentimes women and other minority groups only apply to open roles if they meet 100% of the listed criteria. Olive encourages everyone — including women, people of color, individuals with disabilities and those in the LGBTQIA+ community — to apply for our available positions, even if they don't necessarily check every box on the job description.



This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee. Duties, responsibilities and activities may change or new ones may be assigned.

This job description does not constitute a contract of employment and Olive AI, Inc. may exercise its employment-at-will rights at any time.


We take the health and happiness of our employees seriously and consistently evaluate new ways to provide an amazing place to work. From retirement planning, to a wellness program designed to actively incorporate mental and physical wellness into daily interactions amongst fellow Olivians, we make sure to take care of our own.

  • Health, Dental, and Vision insurance that starts on your first day at Olive with 100% of premiums covered for team members and 75% covered for dependents
  • Monthly Grid stipend to cover work-related expenses
  • Unlimited PTO
  • Telemedicine
  • EAP/Mental health resources
  • Getaways by Marriott Bonvoy
  • Family-building and fertility support via Kindbody
  • 12 weeks of parental leave
  • 401(K) match
  • Wellness program
  • Stock Options

Job region(s): Remote/Anywhere North America