Sr. Manager, Cloud Infrastructure & Automation

New York, New York, United States

Full Time
Take-Two Interactive Software, Inc.

Posted 1 month ago

* Develop, mentor, and oversee the operations of our infrastructure team managing cloud technologies and automation development
* Provide technical leadership to the teams to:
    * Design, develop and manage services deployment pipelines in collaboration with services development teams
    * Design tools and processes to ensure high availability/reliability of various applications
    * Develop monitoring and alerting solutions to track critical service operations metrics and report deviations
    * Design and tune service deployment parameters for optimal performance
    * Track capacity and resource consumption, forecast capacity requirements and cost
* Proactively identify potential technical challenges, and ensure that the team makes solid, pragmatic technical decisions
* Build a team culture that aims for high service availability, scalability, and observability
* Stay keenly aware of engineering processes and tooling. Actively seek ways to improve them
* Work with other engineering teams on automation initiatives, decisions and troubleshooting
* Define and report on metrics relating to SLAs and uptime

* 15+ years working in the software industry, including at least the last 7+ years building and managing teams and shipping enterprise software through multiple releases
* 15+ years working with Linux (Debian/RHEL based). Extensive knowledge of systems management best practices and fundamentals
* Deep working experience with VMware virtualization platforms or cloud platforms such as Amazon Web Services or Google Cloud
* 8+ years of experience in managing production-critical infrastructures and DevOps environments
* 5+ years of work experience in Site Reliability/Infrastructure Engineering for a team operating distributed systems/cloud infrastructure
* Kubernetes / Mesos deployment and management experience - ECS, EKR and/or KOPS deployments
* Strong self-starter and problem-solver who is operationally focused and has a holistic data perspective
* Knowledgeable in network, firewall, and security best practices
* Extensive experience with infrastructure automation and monitoring distributed systems
* Demonstrated ability to understand and solve deep technical issues
* Prior experience with cloud migrations a plus
* Strong software development and project management fundamentals


Job tags: Debian, High availability, Kubernetes, Linux, Mesos, Virtualization, VMware