Staff Engineering Manager, Site Reliability Engineering
Remote - Asia Pacific
GitHub
GitHub is where over 65 million developers shape the future of software, together. Contribute to the open source community, manage your Git repositories, review code like a pro, track bugs and features, power your CI/CD and DevOps workflows,...
This role is part of a larger global SRE team that works to support, automate and improve the infrastructure that underpins GitHub’s new managed SaaS offering. You’ll lead the team responsible for operations (the day-to-day monitoring, administration and operations necessary to run GitHub in a cloud environment at scale) and for reliability engineering (the work to automate, iterate on and improve the systems that make this SaaS offering work).
You’ll help grow our culture of inclusion, collaboration and togetherness. We are a remote team and work with teammates across the world and in many different time zones. "Mental health first" is one of our team mantras. In a COVID world we want to build as healthy and sustainable a team as we can through empathy, flexible schedules and empowerment. Within those confines we are passionate about building a team that scales through shared knowledge and well-defined best practices.
Our platform is helping deliver a cloud native experience to GitHub’s new SaaS offering. Docker, Kubernetes, infrastructure as code, managed cloud services and distributed security are where most of our time will be spent. In addition to managing the SRE team, experience operating managed public services and knowing how to think about monitoring, automation, day-to-day administration and incident management will be key. Familiarity with pair programming, TDD and Kanban, and a desire to ship many times a day, will fit well with our way of working.
Responsibilities:
- Manage a remote team of Site Reliability Engineers: provide regular feedback, ensure career growth and progression, coordinate work, build relationships, and identify opportunities and areas for improvement and innovation.
- Work with the team to explore how we can solve problems, often via real-time conversations in Slack or Zoom, with asynchronous communication in GitHub Issues, PRs, Discussions, and Projects.
- Ensure our service runs with a high level of reliability, performance, and security.
- Cultivate an environment where team members are empowered through a collective sense of ownership and belonging.
- Collaborate with engineering and support teams, product management and engineering leadership to define and prioritize projects that help us meet our objectives.
- Participate in hiring and sourcing to build a diverse, high-performance engineering team.
- Encourage an environment of technical excellence, and facilitate architectural discussions and decision making.
Qualifications:
- 3 or more years of experience as a manager of software engineers
- Prior experience working in distributed or remote teams
- Passionate about fostering good engineering practices, tools, and processes
- Excellent verbal and written communication skills
- Highly developed organizational, people, and process skills
- A track record of providing feedback, teaching and mentoring others, and learning new skills
- Experience developing a strategy and roadmap for your teams
- Experience with at least one cloud platform, such as Azure
- Comfort with the GNU/Linux operating system, particularly Ubuntu
- Experience with Docker
- Familiarity with Kubernetes
- Experience with scripting and automation
- Experience building infrastructure and automation
- Experience with monitoring, alerting and operations
- Experience with distributed systems with high availability requirements
- Experience with incident response
- Familiarity with Git and GitHub
Preferred qualifications:
- Experience with Azure (AAD, Security, ARM, AKS)
- Experience with Kubernetes
- Comfort with at least one modern programming language, such as Golang
- Experience operating services at scale
- Experience with highly available systems at scale
- Experience building and deploying software in a SaaS environment
- Experience with incident command/management
Who We Are:
GitHub is the developer company. We make it easier for developers to be developers: to work together, to solve challenging problems, and to create the world’s most important technologies. We foster a collaborative community that can come together—as individuals and in teams—to create the future of software and make a difference in the world.
Customer Obsessed - Trust by Default - Ship to Learn - Own the Outcome - Growth Mindset - Global Product, Global Team - Anything is Possible - Practice Kindness
Why You Should Join:
At GitHub, we constantly strive to create an environment that allows our employees (Hubbers) to do the best work of their lives. We've designed one of the coolest workspaces in San Francisco (HQ), where many Hubbers work, snack, and create daily. The rest of our Hubbers work remotely around the globe. Check out an updated list of where we can hire here: https://github.com/about/careers/remote
We are also committed to keeping Hubbers healthy, motivated, focused and creative. We've designed our top-notch benefits program with these goals in mind. In a nutshell, we've built a place where we truly love working, and we think you will too.
GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!
Please note that benefits vary by country. If you have any questions, please don't hesitate to ask your Talent Partner.