Site Reliability Engineering Manager, Storage
Redwood City, CA
What is Box
Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, collaboration and workflow. We have an amazing opportunity to further establish ourselves as leaders in the space, and we need strong advocates to help us achieve that goal. By joining Box, you will have the unique opportunity to help capture a majority of this developing market and define what content management looks like for the digital enterprise. Today, Box powers over 97,000 businesses, including 70% of the Fortune 500, who trust Box to manage their content in the cloud.
Why Box needs you
We're a diverse mix of backgrounds and skill sets, and each one of us makes a big difference every day. We need someone who knows how to run systems at scale and is passionate about using software to manage them and put the E in SRE. In particular, we're looking for a principled leader who can grow, build, and develop a world-class SRE team to support Box's mission. A person with equal parts EQ and IQ, and a desire to continually learn in both domains. As the leader for our Storage SRE team, you and your team will support some of Box's most critical services at a massive scale. We are looking for an experienced engineering leader to realize our vision of scalable, reliable and fault-tolerant storage systems. Someone who combines technical breadth and depth with strong management chops to successfully define and deliver on the roadmap.
What you'll do
- You will be leading a team of engineers to design, write, and ship software that manages large distributed storage systems in a production environment with high uptime requirements
- You'll be evolving our operations practices, improving efficiency, and ensuring our services are resilient and scalable to meet high uptime requirements
- You will be communicating with all levels of the organization, from engineers to vice presidents
- Partnering closely with the development teams that manage the upload, encryption layer, and policy engine services written in-house
- Working to transition services into our public cloud platforms as part of our strategic roadmap
- You're an experienced hands-on leader who's passionate about making other people successful
- You're at home on the command line, troubleshooting and writing automation on a Linux system
- You are passionate about solving infrastructure problems with software
- You have managed an engineering team for at least 5 years with demonstrated growth.
- You are experienced with some of the following data technologies: Linux, Kubernetes, Scala or other JVM-based languages, and HBase.
- You are comfortable working with Linux and RAID arrays for legacy storage systems, and working extensively in Python.
- You have experience with common DevOps/SRE configuration, alerting, and observability tools and processes.
- Experience in one of the public cloud providers (AWS, GCP or Azure) is a big plus
- You have a bachelor's degree in computer science or equivalent
- Visit this webpage to check out all of our exciting benefits: https://join.collectivehealth.com/box
- For all other benefits, please check out: Box Benefits + Perks
Job tags: AWS Azure GCP Kubernetes Linux Python Reliability engineering Scala