SRE Manager - Client Reliability

New York City - NY, Remote


Posted 1 month ago

Company Description
Hyperscience is a technology company blazing a new path in enterprise automation with a reimagined approach to building and powering processes. The Hyperscience Platform is the world's first Software-Defined, Input-to-Outcome Automation platform used by top public companies and government organizations around the world to build and run mission-critical processes with ease and speed.
Hyperscience helps enterprises quickly build and roll out new business processes with built-in automations, reduce manual errors, increase high- and low-skilled employee productivity, and eliminate the need for costly transformation. Hyperscience’s Intelligent Document Processing solution has been implemented at some of the world's leading financial services, insurance, healthcare and government organizations, including TD Ameritrade, QBE Insurance Group Limited and Voya Financial, helping them lower costs, reduce error rates by 67% and increase employee capacity by 10x.
Since its founding in 2014, Hyperscience has grown to more than 150 employees with offices in New York City; Sofia, Bulgaria; and London, UK. It has consistently been recognized as one of the best places to work, with a collaborative and innovative culture and best-in-class benefits.
Job Description
As the Site Reliability Engineering Manager, you will lead capabilities that are critical to Hyperscience’s success and the success of our clients.
Hyperscience serves our customers on-premise (including client-managed private clouds) and is currently building out our Cloud offering. You will lead the team defining the technology stack, tooling, automation, standards and practices for the SRE capability focused on our clients’ operations and infrastructure. You will partner with our CloudOps team on our SaaS and general Cloud operations, including alerting, monitoring, incident management, and the definition of and adherence to effective SLAs. You will partner with our application engineering teams to ensure alignment with the Product roadmap and technical roadmap.
For our on-premise customers, you will define standards, practices, and processes for on-premise incident management and escalation, including Tier 2 and Tier 3 support, as well as an SRE roadmap for tooling and automation to make on-premise operations and serviceability simple for our customers. You will partner with the Customer Success (CX) team to ensure effective collaboration in support of our clients.
As with Google’s definition of SRE, our focus is on automation. This means that strong software development and scripting practices and capabilities are central to the team.
You will also be responsible for the development and growth of the SRE team members. You will serve as a strong mentor for your team. You will also partner with them to guide their career development and personal growth.
This is an exciting time for Hyperscience’s product and business. You will have the opportunity to influence and deliver on a bold vision for transforming the way organizations model and execute their business processes, and there will be many opportunities for growth along the way.
Location
Ideally this position is in New York City or Toronto, but working remotely elsewhere on the US East Coast is an option.


  • Drive SRE strategy, with a strong focus on standards, practices, processes, automation, and measurement, including definition of SLIs, SLOs, and SLAs.
  • Partner with Customer Success and Product Engineering teams to define and evolve Tier 2 and Tier 3 processes and practices.
  • Partner with Product Engineering to improve aspects of product serviceability and operations, including on-premise deployment, operations, observability, and incident management.
  • Partner with CloudOps on cloud infrastructure operations and incident management standards and practices.
  • Lead growth of the SRE area, including improvements to our hiring practices, onboarding, mentorship and career development practices.
  • Drive career development, personal, and technical growth of all team members.
  • Drive organizational improvements both within your area and across all of Product Engineering.
  • Oversee and manage vendors, vendor contracts and appropriate SLAs within the SRE area.


  • 3+ years of SRE management experience.
  • Experience defining and managing a roadmap for an SRE team that drives change across an organization, including collaborating with Product and other engineering teams and stakeholders on the roadmap.
  • Strong experience with SRE tech stack, including tooling and automation strategy and implementation to improve incident response effectiveness across Engineering and CX.
  • Experience technically overseeing a team in developing tooling and automation, including setting technical direction and reviewing implementation.
  • Deep experience defining and implementing standards, practices, and processes, including driving adoption and change management.
  • Experience in operations, including automation, monitoring, alerting, and incident management.
  • Experience establishing and evolving Tier 2 and Tier 3 support processes for on-premise and Cloud.
  • 2+ years of experience with AWS infrastructure, including strong architecture, infrastructure automation, and operational experience.
  • Experience with both SaaS products and on-premise delivery.
  • Experience building highly collaborative teams with strong bottom-up ownership.
  • Excellent verbal and written communication skills.
  • Experience with GCP and Azure is a plus.

Benefits
  • Top notch healthcare for you and your family
  • 30 days of paid leave annually to help nurture work-life symbiosis
  • A 100% 401(k) match for up to 6% of your annual salary
  • Stock Options
  • Wellness stipend
  • Pre-tax transportation and commuter benefits
  • 6-month parental leave (or double salary to pay for your partner's unpaid leave)
  • Free travel for any person accompanying a breastfeeding mother and her baby on a business trip
  • A dependent care stipend up to $3,000 per month, per child, under the age of 21 for a maximum of $6,000 per month total
  • Daily catered lunch, snacks, and drinks
  • Budget to attend conferences, train, and further your education
  • $1,000 one-time-use WFH stipend and $75 monthly WFH stipend
  • Relocation assistance
We are an equal opportunity employer. We welcome people of different backgrounds, experiences, abilities and perspectives. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.
For Sofia/UK roles: All job applications will be treated and processed with strict confidentiality and in full compliance with the GDPR provisions. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Job tags: AWS Azure GCP Reliability engineering
Job region(s): North America Remote/Anywhere