IaaS Monitoring DevOps Engineer

Dallas, Texas, US

IBM

Posted 3 weeks ago


Introduction
Software Developers at IBM are the backbone of our strategic initiatives to design, code, test, and provide industry-leading solutions that make the world run today - planes and trains take off on time, bank transactions complete in the blink of an eye and the world remains safe because of the work our software developers do.  Whether you are working on projects internally or for a client, software development is critical to the success of IBM and our clients worldwide.  At IBM, you will use the latest software development tools, techniques and approaches and work with leading minds in the industry to build solutions you can be proud of.


Your Role and Responsibilities

How can we effectively design and deliver for a large-scale, highly distributed cloud infrastructure? We are looking for an individual to join a team that designs and implements the back-end infrastructure supporting the IBM Cloud. The job provides the opportunity to be a key part of a team delivering the networks, infrastructure, and services for a world-class cloud.

  • Implement and administer infrastructure and solutions that support the IBM Cloud

  • Support the compliance and security integrity of the environment through your work

  • Partner with other teams, functional managers and program managers to deliver mission-critical services to the market

  • Support development of new capabilities and enhancement of existing ones for our compute, storage and network services

  • Provide technical escalation support for other Infrastructure Operations teams

  • Design, implement and manage a reliable, high-performance monitoring and alerting framework with dashboards, analytics, and correlation across IaaS

  • Work with and adopt open source technologies as well as participate in new IBM innovations, not just around monitoring, alerting, dashboards and root cause analysis, but across IaaS

  • Work towards a more autonomous root cause analysis system which deduplicates alerts and provides for a comprehensive single pane of glass monitoring infrastructure

  • Bring a self-driven attitude to proposing, testing and implementing solutions and improvements for review and consideration by your peers

Additional Technical Requirements
  • 2+ years of experience with one or more virtualization technologies: Citrix Xen Hypervisor (preferred), KVM (also preferred), libvirt, QEMU, VMware vSphere, etc.

  • 2+ years of experience with one or more automation and configuration management tools/solutions: Ansible (preferred), Salt, Chef, Python, Bash, Puppet, Rundeck, etc.

  • 2+ years of experience with version control systems: GitHub (preferred), GitLab, Subversion, etc.

  • Experience with one or more programming languages: PowerShell, Python, or Ruby

  • Practical experience with orchestration based on desired-state models and/or finite-state-machine models: Kubernetes (preferred), OpenShift, etc.

  • Practical experience with containerization and container orchestration: Docker (preferred), Kubernetes (preferred), OpenShift, Rancher, Docker Swarm, Docker Compose

  • 2+ years of at least basic experience with databases, both RDBMSs like MySQL or PostgreSQL and other data stores and storage engines such as etcd, TimescaleDB, InnoDB, etc. This is not a DBA role.

  • Working knowledge of network and storage technologies

  • Working knowledge of ServiceNow, JIRA, Confluence, and GitHub

  • ITIL Foundation V4 certification is a plus





Required Technical and Professional Expertise
  • 5+ years of experience in data center infrastructure or relevant work experience

  • 5+ years of experience in large-scale infrastructure design, engineering, and support

  • 5+ years of experience in IT change, incident, problem and asset management

  • 5+ years of infrastructure engineering with a proven record of delivering high-quality, large-scale solutions. Experience designing architectures for scale and performance

  • 5+ years of practical experience with one or more operating systems: Ubuntu (preferred), CentOS, RHEL or Debian Linux, or Windows Server.

  • 5+ years of experience debugging issues across a Linux environment with network, storage, compute and orchestration components. This does not need to be code-level debugging.

  • 2+ years of extensive experience with Monitoring technologies: Zabbix (preferred), Grafana, Nagios, Zenoss, ELK, Splunk, etc.




Preferred Technical and Professional Expertise
  • Excellent verbal and written communication skills

  • Highly responsible, motivated, able to work with little direction

  • Experience with design and development of complex systems

  • Ability to troubleshoot complex problems and customer issues

  • Working knowledge of Linux clustering, HA, and fault-tolerant system implementations: active/active, active/passive, Pacemaker, Keepalived, HAProxy, Corosync, LVM

  • 2+ years of experience with complex systems and layered architecture models: OSI, Kubernetes, virtualization, TCP/IP, etc.

  • Working knowledge of TCP/IP, BGP, sockets, routing protocols, routes and Keepalived, and of the role each plays in debugging and in highly available systems at scale.

  • Ability to debug an issue across the entire OSI stack of a typical Linux environment, spanning storage, network, compute, OS, system tuning and orchestration.

  • Ability to follow stack traces back to particular libraries in code and identify the root cause.





About Business Unit
Digitization is accelerating the ongoing evolution of business, and clouds - public, private, and hybrid - enable companies to extend their existing infrastructure and integrate across systems. IBM Cloud provides the security, control, and visibility that our clients have come to expect. We are working to provide the right tools and environment to combine all of our clients’ data, no matter where it resides, to respond to changing market dynamics.


Your Life @ IBM
What matters to you when you’re looking for your next career challenge?

Maybe you want to get involved in work that really changes the world? What about somewhere with incredible and diverse career and development opportunities – where you can truly discover your passion? Are you looking for a culture of openness, collaboration and trust – where everyone has a voice? What about all of these? If so, then IBM could be your next career challenge. Join us, not to do something better, but to attempt things you never thought possible.

Impact. Inclusion. Infinite Experiences. Do your best work ever.


About IBM
IBM’s greatest invention is the IBMer. We believe that progress is made through progressive thinking, progressive leadership, progressive policy and progressive action. IBMers believe that the application of intelligence, reason and science can improve business, society and the human condition. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 380,000 IBMers serving clients in 170 countries.


Location Statement
For additional information about location requirements, please discuss with the recruiter following submission of your application.

IBM intends this job to be performed entirely outside of Colorado.


Being You @ IBM
IBM is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, pregnancy, disability, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.












Job tags: Ansible Bash CentOS Chef Debian Docker ELK Gitlab Grafana Infrastructure design Jira Kubernetes Linux MySQL Open source Puppet Python Ruby Salt Ubuntu Virtualization VMware Windows
Job region(s): North America