Senior Site Reliability Engineer
London, England, United Kingdom
LiveRamp powers exceptional experiences by making it safe and easy to connect the world's data, people, and applications. We are the industry pace-setter and one of the fastest-growing SaaS businesses, the enabling product behind many of the world's biggest brands and technology platforms.
The Global SRE team is responsible for owning and supporting deployments of global products and for providing first-line operational support. We are looking for a Senior Site Reliability Engineer who is excited about establishing and advocating for best practices in product deployment and SRE. You will leverage your software engineering expertise to understand the needs of teams and guide them in improving their systems.
- Support and/or own the deployment of global products, including setting up production and internal environments
- Provide 24/7 first-line Engineering support (via follow-the-sun teams in all regions) for any issues related to global product deployment, availability, and internal operations support.
- Drive effective resolutions of core product issues with Engineering teams
- Set up and maintain Infrastructure & Product Reliability monitoring and alerting
- Maintain and enhance CI/CD Tooling and Terraform scripts in support of the mission, in close collaboration with the DevOps team
- Maintain and enhance Engineering Operational Documentation for supported products.
- Provide expertise in building and maintaining product operational documentation and in setting up product SRE practices
- Support Security and Compliance governance in production environments
- Work in close collaboration with SRE team members and Engineering organizations based in California, Paris, Nantong, Singapore, Australia, and other locations.
- 5+ years of experience in the fields of SRE, DevOps or production engineering
- Experience with continuous integration and automation servers such as Jenkins or CircleCI
- Experience with platforms such as Kubernetes, containers, and public clouds
- Experience with deployment and monitoring of highly scalable products
- Experience with one or more object-oriented programming languages (e.g. Java, Python, Go) or scripting languages (e.g. Groovy, Shell)
- Experience with SRE best practices; working knowledge of observability principles is a big plus
- Ability to lead and mentor other engineers on the team in SRE best practices
- Ability to diagnose technical problems, debug code, and automate routine tasks
- Experience with securing systems in a public cloud environment
- Understands how to engage other engineers as stakeholders
- Enjoys working as part of a distributed team that is smart, ethical, friendly, hard-working, and productive