Site Reliability Engineering Manager (IN)
Bengaluru, Karnataka, India
As part of this job, you will manage and grow the engineering team responsible for production reliability. You will design operations as code, work with product engineering to improve reliability, implement an actionable monitoring framework, and design a 24/7 process for responding to critical incidents.
- Manage and grow the global SRE engineering team.
- Manage and prioritize SRE work across multiple Fortanix products.
- Own production upgrades, migrations, disaster recovery drills, backup/restore, cloud environment security, logging, and log analytics.
- Work with DevOps, Networking, Customer Success, and Development to continuously improve the production environment.
- Design metrics to measure quality improvements in production and work with Customer Success to define SLA/SLO/SLI.
- Own the service status and incident reporting portal.
- Improve the on-call incident response for critical issues.
- Respond to and communicate with impacted customers, and provide root-cause analysis and an action plan.
- Design tests to simulate scenarios/events before they occur.
- Manage IAM for production systems and implement auditability of access.
Experience with modern enterprise site reliability engineering, along with experience in the following areas:
- Advanced experience managing software deployment to the cloud via pipelines (for example, Bitbucket/GitLab) and to data centres.
- Understanding of DevOps practices for how modern software is deployed, upgraded, and monitored.
- Experience with both managed (AKS, EKS, GKE) and unmanaged (on-prem) Kubernetes, especially production experience with Kubernetes and Docker.
- Advanced experience with Linux administration and automation.
- Experience with high-level network infrastructure for data centres and the cloud.
- Understanding of the security aspects of an internet-exposed SaaS service.
- Bachelor's/Master's in Computer Science, Engineering, or a related field.
- 12+ years of engineering experience with 3+ years of management experience focused on site reliability engineering.
- Solid understanding of Cloud technologies.
- Demonstrated ability to coordinate cross-functional work teams toward completion.
- Demonstrated multitasking, effective leadership and analytical skills.
- Advanced written and verbal communication skills are a must.
- Must be a team player.
- Medical insurance
- Friendly culture that brings the best out of everybody