Site Reliability Engineer, Managed Hosting
Remote - Canberra, Australia
GitHub helps companies and organizations succeed by allowing them to build better software through powerful, flexible CI/CD and automation, directly in the developer workflow.
GitHub’s Field Architecture team is looking for systems and software engineering professionals to join a new Operations and Reliability team. This team will be working to support, automate and improve the infrastructure that underpins GitHub’s new managed SAAS offering. This role is an opportunity to learn and grow on multiple fronts.
As an operations engineer you’ll be responsible for deployment and the day-to-day monitoring, administration and operations necessary to run GitHub in a cloud environment at scale. As a reliability engineer you’ll work to automate, iterate and improve the systems that make this SAAS offering work.
Most importantly, you’ll help grow our culture of inclusion, collaboration and togetherness. We are a remote team and work with teammates across the world and in many different time zones. Mental health first is one of our team mantras. In a Covid world we want to build as healthy and sustainable a team as we can through empathy, flexible schedules and empowerment. Within those confines we are passionate about making a team that is scalable through shared knowledge and well-defined best practices.
Our platform is helping deliver a cloud native experience to GitHub’s new SAAS offering. Infrastructure as code, managed cloud services, configurations and distributed security are where most of our time will be spent. Experience operating managed public services and knowing how to think about monitoring, automation, day-to-day administration and incident management will be key.
This role at GitHub is an opportunity to blend your system design, empathy, and software engineering skills on an ever-changing set of novel reliability and operations challenges. Join us on this journey and have a meaningful impact on how the world builds software.
The Job - what you’ll be doing:
- Manage GitHub’s SAAS deployments, configuration, and infrastructure updates
- Ensure smooth day-to-day operations of GitHub’s SAAS offering for our customers
- Automate away as much of the day-to-day work as possible - “Run By Robots” is the goal.
- Drive organization-wide best practices for monitoring, alerting and incident management.
- Identify, respond to, and collaborate with support and product teams to resolve production and customer issues and incidents.
- Interact professionally with customers when dealing with one-off configurations, network troubleshooting or incident reporting.
- Help drive new features, abilities and code changes into the core product with an operations-focused point of view.
- Experience with Git and GitHub
- Experience with at least one cloud platform, such as Azure
- Comfort with the GNU/Linux operating system, particularly Ubuntu
- Familiarity with Docker and Kubernetes
- Experience with scripting and automation, particularly bash
- Experience building infrastructure and automation
- Experience with monitoring, alerting and operations
- Experience with distributed systems with high availability requirements
- Experience with incident response
- Experience with Azure (AAD, Network Security, Virtual Networks, Load Balancing)
- Experience operating highly available systems
- Experience building and deploying software in a SAAS environment
- Experience as incident commander
It’s not expected that you have expertise in all these areas - we’re looking for someone who is particularly strong in a few areas and has some interest and capabilities in others.
The ability to meet GitHub, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Cloud Screen of Microsoft, GitHub’s parent company, upon hire/transfer and every two years thereafter.
Who We Are:
GitHub is the developer company. We make it easier for developers to be developers: to work together, to solve challenging problems, and to create the world’s most important technologies. We foster a collaborative community that can come together—as individuals and in teams—to create the future of software and make a difference in the world.
Customer Obsessed - Trust by Default - Ship to Learn - Own the Outcome - Growth Mindset - Global Product, Global Team - Anything is Possible - Practice Kindness
Why You Should Join:
At GitHub, we constantly strive to create an environment that allows our employees (Hubbers) to do the best work of their lives. We've designed one of the coolest workspaces in San Francisco (HQ), where many Hubbers work, snack, and create daily. The rest of our Hubbers work remotely around the globe. Check out an updated list of where we can hire here: https://github.com/about/careers/remote
We are also committed to keeping Hubbers healthy, motivated, focused and creative. We've designed our top-notch benefits program with these goals in mind. In a nutshell, we've built a place where we truly love working, and we think you will too.
GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!
Please note that benefits vary by country. If you have any questions, please don't hesitate to ask your Talent Partner.