Cloud Infrastructure Services Site Reliability Engineer (SRE)
Auckland, Auckland, NZ
IBM welcomes applications from people of all backgrounds. If you require accommodations or adjustments, we encourage you to make us aware so that we can support you through the application process.
Your Role and Responsibilities
Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, distributed systems. SRE is a key role in our banking customer’s growing and dynamic Cloud platform. This technical role is focused on deploying, maintaining, and automating a wide range of operational tasks for IBM’s Cloud offering at the bank. You will work collaboratively with the IBM Cloud team, the customer and associated vendors to support, maintain, and operationally improve the reliability of the platform.
This is your chance to be an integral part of a dynamic team of talented professionals deploying and maintaining innovative, industry-leading, cloud-based Infrastructure as a Service (IaaS) solutions.
What will I be doing?
- You are a strategic thinker and will help drive the bank's journey from its current hybrid platform to an open platform such as Red Hat’s OpenShift / OpenStack, while helping to drive stability and resilience.
- You will work with the IBM Cloud service team at the bank and drive continuous improvement.
- You will be responsible for the Operational Stability and Performance of one or more Critical Business Services.
- You should have a strong desire to work within a CI/CD environment and have a passion for embracing new cloud technologies and working with our customer to ensure they are successful.
- You need to be collaborative, able to handle responsibility, and love learning new techniques and tools.
- With quality, robustness and security in mind, you will drive and implement new tools to facilitate operations, improve reliability, gather insights into our platform, and make changes to resolve or mitigate common operational issues.
- You will work to improve the automation of common processes and find enhancements and innovative solutions to help the services both scale and become increasingly self-healing.
What skills and experience will I need?
- Proven experience debugging and optimizing code and automating routine tasks
- Experience in Systems Engineering, such as Linux I/O tuning, performance, memory management and troubleshooting; and solid Linux knowledge, particularly with RHEL
- Experience troubleshooting production system and network issues
- Experience working with containerised workloads and management platforms like Docker or Kubernetes.
- Knowledge of event streaming technologies such as Apache Kafka.
- Experience with database internals and diagnosing memory leaks, data corruption, replication, database performance and tuning.
- Previous experience working with public cloud platforms like IBM Cloud, AWS, Azure or others.
- Experience with GitHub, CI/CD pipelines (Jenkins, Ghenkins, Tekton, etc.), the IBM Cloud UI and/or CLI, and the IBM Cloud stack (I&AM, CF, ALB)
- Familiarity with API gateways and microservices, including RESTful APIs
What skills or experience may be helpful
- There is no requirement to be an expert in any one language; however, knowledge of Go, Python, Jenkins, Kubernetes, Chef, Satellite, Ansible, YAML, OpenStack and OpenShift is useful.
- Knowledge of operating highly available databases such as DB2, Oracle, SQL Server or Elasticsearch in production environments, and/or streaming technologies such as Apache Kafka, would also be useful.
About Business Unit
At Global Technology Services (GTS), we help our clients envision the future by offering end-to-end IT and technology support services, supported by an unmatched global delivery network. It's a unique blend of bold new ideas and client-first thinking. If you can restlessly reinvent yourself and solve problems in new ways, work on both technology and business projects, and ask, "What else is possible?" GTS is the place for you!
Your Life @ IBM
What matters to you when you’re looking for your next career challenge?
Maybe you want to get involved in work that really changes the world? What about somewhere with incredible and diverse career and development opportunities – where you can truly discover your passion? Are you looking for a culture of openness, collaboration and trust – where everyone has a voice? What about all of these? If so, then IBM could be your next career challenge. Join us, not to do something better, but to attempt things you never thought possible.
Impact. Inclusion. Infinite Experiences. Do your best work ever.
IBM’s greatest invention is the IBMer. We believe that progress is made through progressive thinking, progressive leadership, progressive policy and progressive action. IBMers believe that the application of intelligence, reason and science can improve business, society and the human condition. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 380,000 IBMers serving clients in 170 countries.
For additional information about location requirements, please discuss with the recruiter following submission of your application.
Being You @ IBM
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, pregnancy, disability, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
Job region(s): Asia/Pacific