Site Reliability Engineer - Platform (Senior)
Brno, Brno-Město, CZ
System Administrators are the gatekeepers to the many systems that run our company and our clients. As a System Admin with IBM, you will have the opportunity to provide high-value IT services and leverage our leading-edge technology portfolio in our global network. Your work has a direct impact on the day-to-day productivity of our business by ensuring integrity of, and access to, our most important resource: data.
Your Role and Responsibilities
The role of the Site Reliability Engineer is to operate applications in production (“mission-critical systems”) and do whatever is necessary to keep the site up and running. The role is often described as a software engineer doing operations work.
The SRE teams are responsible for establishing and maintaining service level indicators (SLIs), objectives (SLOs), agreements (SLAs), and error budgets for their systems, and for making sure these are met. They are expected to spend part of their time on operational work (making sure systems work as expected) and the rest on improving the systems they manage. SREs focus on writing software to automate processes and reduce toil. Toil is manual, repetitive work on a system: anything that is not currently automated.
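To make the error-budget idea concrete, here is a minimal sketch of the underlying arithmetic. It assumes an availability SLO expressed as a fraction (for example 0.999) over a 30-day window; the function names and figures are illustrative assumptions, not taken from any IBM system or tool.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over the window.

    `slo` is a fraction such as 0.999 (hypothetical target, for illustration).
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent.

    A negative result means more downtime was recorded than the SLO allows.
    """
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget
```

Under these assumptions, a 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime, so 21.6 minutes of recorded downtime would leave about half of the budget unspent.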
Your main responsibilities will consist of:
•Get rid of toil through automation
•Eliminate the root causes of problems and reduce technical debt
•Design, analyze, and troubleshoot large-scale distributed systems
•Participate in the on-call rotation
•Engage with product teams to fix production outages and carry forward action items to improve ongoing reliability
•Develop effective tooling, alerts, and response to both identify and address reliability risks including automatic problem detection and mitigation
•Handle complex incidents and problems, and provide support and technical expertise to your colleagues and to the client
•Take part in designing solutions and in determining configuration and administration rules and parameters with the Service Line & Sales Managers
•Participate in the setup and migration of VMware and Microsoft projects
•Set up operational dashboards and reporting
•Write technical documents, operational procedures and DR plans
•Contribute to the ITIL processes, including Incident, Problem, Availability, Capacity, Change and Configuration Management
•Where required, draw up a capacity planning schedule and keep it up to date
•Perform technical life cycle management and review new solutions
You should have:
•Understanding of automation frameworks such as Ansible, and familiarity with the TCP/IP stack, network routing and load balancing
•A systematic approach to troubleshooting and a deep sense of ownership for whatever you work on
•Ability to root-cause sources of instability in a high-traffic, distributed system
•Understanding of large-scale, complex systems from a reliability perspective
•Passion for resolving reliability issues and identifying strategies to mitigate them going forward
•Willingness to work in an ever-changing environment
Technical Requirements:
•Windows Server and clusters - 2012, 2016, 2019
•Windows PowerShell scripting
•Install, configure, update, and administer VMware ESX 4.x/5.x/6.x virtualized servers
•Configure and manage ESX, vSphere, vCenter, and vMotion for customer systems
•vCloud, vRealize, Site Recovery Manager
•Configure, manage, and optimize High Availability options, Distributed Resource Scheduler, Update Manager, shared storage and datastores, and RDM management
•Operate, monitor, and maintain IBM BladeCenter architecture, or other similar technologies (HP, DELL)
•Operate and manage backup/restore tools, e.g. Veeam, TSM4VE
•Perform all virtual storage management and maintenance tasks
•Evaluate, develop, and implement hardware and software solutions to ensure data and system integrity
•Review and deploy service packs, hot fixes, system updates, and vendor-supplied patches, driver and firmware updates
•Monitor system capacity utilization, evaluate trends and plan for future needs
•Perform root cause analysis on all VMware products, including ESX hosts, Virtual Centers, and Virtual Machines
•Administer, maintain and troubleshoot Storage Area Networks (SAN, vSAN) and Network Attached Storage (NAS) attached to VMware
•Good oral and written communication skills
•Work well as both an individual contributor and as part of an integrated team
•Able to quickly understand complex technical problems and devise effective technical solutions
•Able to convey a strong presence, professional image, and deal confidently with competing priorities
•Dedicated to delivering strong, customer-focused performance in a mission-critical environment
Nice to Have:
•Experience with Citrix XenDesktop, XenApp and NetScaler
As an IBM employee, you will be entitled to the following benefits:
- 5 weeks of paid vacation
- Comprehensive education program for each employee: training throughout your career, with courses led by professional instructors; e-learning; a flexible education plan for each job position
- Strong career opportunities
- Access to Hi-tech; MAC@IBM
- Above-standard medical care
- Discounts in Sports, Culture, Healthcare, Childcare, Finance, Electronics
- Global Travel and Life insurance
- Contribution to the Pension fund
- IBM stock purchase plan
You will have the opportunity to:
- Join our Succeeding@IBM Program - a structured on-boarding and development program
- Become part of our diverse and multinational community and collaborate within global and local teams
- Gain knowledge and develop skills through our world-class training
- Benefit from mentoring and coaching
- Balance your work with your life and enjoy a flexible working environment
Please note that our recruiters will contact you if your application is successful.
About Business Unit
At Global Technology Services (GTS), we help our clients envision the future by offering end-to-end IT and technology support services, supported by an unmatched global delivery network. It's a unique blend of bold new ideas and client-first thinking. If you can restlessly reinvent yourself and solve problems in new ways, work on both technology and business projects, and ask, "What else is possible?" GTS is the place for you!
Your Life @ IBM
What matters to you when you’re looking for your next career challenge?
Maybe you want to get involved in work that really changes the world? What about somewhere with incredible and diverse career and development opportunities – where you can truly discover your passion? Are you looking for a culture of openness, collaboration and trust – where everyone has a voice? What about all of these? If so, then IBM could be your next career challenge. Join us, not to do something better, but to attempt things you never thought possible.
Impact. Inclusion. Infinite Experiences. Do your best work ever.
IBM’s greatest invention is the IBMer. We believe that progress is made through progressive thinking, progressive leadership, progressive policy and progressive action. IBMers believe that the application of intelligence, reason and science can improve business, society and the human condition. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 380,000 IBMers serving clients in 170 countries.
For additional information about location requirements, please discuss with the recruiter following submission of your application.
Being You @ IBM
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.