Tech Lead, Site Reliability Engineering

San Mateo, CA

Guidewire Software

Posted 3 weeks ago

We are looking for an experienced Tech Lead, Site Reliability Engineering to join our Cloud Data Platform team. This is a highly visible, hands-on role with end-to-end ownership of application and platform availability, performance, and scalability. SRE is a new team within Cloud Data Platform. This is an opportunity to define, build, evangelize, and optimize our SRE practices!

Roles and Responsibilities

  • Build and lead a diverse team of world-class engineers
  • Maintain a 24x7 production environment with a high level of service availability.
  • Drive incidents to resolution by coordinating with multiple engineering teams
  • Partner with other development teams in defining and implementing improvements in service architecture.
  • Understand the near-, mid-, and long-term needs of the business and how the team's work contributes to them
  • Implement automation and orchestration for manual processes required to operate and deploy cloud services
  • Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
  • Form and maintain relationships with internal and external partners
  • Develop deeper insights and analysis into the quality of experience for our customers
  • Build a positive work environment based on accountability, in collaboration with the engineering and operations management teams across Guidewire.

About You

  • BS/MS in Computer Science, Computer Engineering, Maths or equivalent work experience
  • 8+ years of relevant work experience
  • 2+ years of experience managing a team of engineers
  • You have led an Infrastructure or SRE team in a production operations context
  • You have experience solving infrastructure problems with software
  • You have a big-picture perspective on systems and tools
  • You can collaborate with other engineering teams to understand their systems and help improve them
  • You have strong technical knowledge of cloud infrastructure, distributed systems, networking, storage, operating systems
  • Experience with Java or Scala development, e.g., Java-based microservices, REST APIs/gRPC
  • Experience with AWS and its native services, e.g. RDS, EMR, Redshift, MSK, SNS, SQS
  • You understand IaaS abstractions like Kubernetes/Cloud Foundry/OpenShift
  • You have experience with Agile development methodologies
  • Secure coding practices are a plus

About Guidewire

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently.
Guidewire combines core, data, digital, analytics, and AI to deliver our platform as a cloud service. 380 insurers, including the largest and most complex in the world, run on Guidewire.
As a partner to our customers, we continually evolve to enable their success. We are proud of our unparalleled implementation track record with 700+ successful projects, supported by the largest R&D team and partner ecosystem in the industry. Our marketplace provides hundreds of add-ons that accelerate integration, localization, and innovation.
Job tags: AWS, C, Java, Kubernetes, Redshift, Reliability engineering, REST, Scala