DevOps Engineer - 12-month Fixed Term Contract
Hatfield, Hertfordshire, UK
“We are on a mission to transform the future of grocery retail through sustained technology innovation.”
Ocado Technology is putting the world’s retailers online using the cloud, robotics, AI, and IoT. We develop the innovative software and systems that power Ocado.com, the world’s largest online-only grocery retailer, as well as the global ‘Ocado Smart Platform’. With everything from websites to fully autonomous warehouses designed in-house, our employees need to be specialists in a wide range of technologies to help drive our business.
We champion a value-led culture to get our teams working at their very best and to help create a collaborative working environment that our people love. Core values of Trust, Autonomy, Craftsmanship, Collaboration and Learn Fast help drive our innovative culture. But don’t just take our word for it: have a look at what our people are saying about us on Glassdoor.
What does the Cloud Platform team do?
The Cloud Platform teams, within the Private & Edge Cloud department, provision and maintain more than 15 Kubernetes environments, manage a very large portfolio, and are responsible for maintaining multiple UK and international Customer Fulfilment Centres (CFCs) as well as supporting the commissioning of new ones. To adapt to the recent reorganisation and meet all the company’s objectives, the team needs to reduce the portfolio (simplify the current solution, use fewer tools, etc.) and grow in size. The team has recently split into two teams of 5 engineers each, allowing engineers to focus on different, equally important priorities.
The mission of the team is to deliver an ever more reliable and scalable on-premise ecosystem for low-latency services, automation and edge devices that is fully adapted to the complete gamut of CFC sizes and geographies (UK included), while ensuring a smooth transition to the OCEngO cloud platform and providing uninterrupted service for existing sites.
What will you be doing...
The main responsibilities of the role are:
- Contributing to the operation of the production systems and providing visibility of the state and health of all components.
- Taking responsibility for site reliability (i.e. availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning) of all production sites.
- Managing the deployment pipelines and repositories to deploy into multiple remote sites and public cloud environments.
- Designing and maintaining services that allow the automated deployment and management of our systems (ELK, NoSQL and relational database technologies).
- Providing tools to effectively monitor, manage, alert on and performance tune Ocado's database estate.
- Establishing and implementing automated processes to ensure that we can seamlessly update and patch our databases to protect against vulnerabilities in accordance with Ocado's data security policies.
- Actively contributing to the process of continual improvement, with regard to self, team and systems.
- Performing root cause analysis and fixing production issues, as we strive to have no support requirement and want our infrastructure to self-heal.
- Supporting production systems as required outside of standard working hours and participating in a 24x7 on-call rota.
Please note that we are looking for a team of people to work on a Fixed Term Contract.
We’d like to talk to you if you have:
- Demonstrable hands-on experience maintaining and administering the full ELK stack
- Demonstrable hands-on experience using git or similar version control systems
- Demonstrable hands-on experience creating and operating Docker containers in a Kubernetes environment
- Demonstrable hands-on experience with scripting languages in a Linux environment (primarily Bash and Python)
- Demonstrable hands-on experience of common build tools, repositories and CI/CD tooling
- Experience using monitoring and alerting tools (e.g. New Relic, Prometheus, Grafana)
- Strong written and verbal communication skills
- Passion for open source technologies
- Basic knowledge of Go
- The inclination and ambition to “Automate Everything”
- Exposure to container orchestration systems such as Kubernetes
- Understanding of database technologies (e.g. Postgres and Cassandra)
- Understanding of virtualisation technologies (e.g. OpenStack)
- Grasp of networking fundamentals
- Knowledge of Scrum or other Agile methodologies
- Experience supporting production environments (out-of-hours rota, etc.)
What we offer you...
Our employee benefits are designed for you. We care about people, and we’ve ensured we have a wealth of benefits that focus on your well-being. Within our flexible environment we can offer technically stretching work, a competitive salary and share schemes. Benefits include a pension scheme, an interest-free train season ticket loan, a free shuttle bus from Hatfield train station and, of course, healthy Ocado retail staff discounts.
We also have regular divisional socials and sports clubs, not to mention the Ocado Technology Academy, which offers a packed schedule of courses, conferences and events such as discussion sessions, conference briefs and external guest speakers. If you think you have what it takes to make a difference, please submit your application below.
Due to the energising nature of Ocado's business, vacancy close dates, when stated, are indicative and may be subject to change, so please apply as soon as possible to avoid disappointment.
Please note: If you have applied and been rejected for this role in the last 6 months, or applied and been rejected for a role with a similar skill set, we will not re-evaluate you for this position. After 6 months, we will treat your application as a new one.
Be bold, be unique, be brilliant, be you. We are looking for individuality and we value diversity above gender, sexual orientation, race, nationality, ethnicity, religion, age, disability or union participation. We are an equal opportunities employer and we are committed to treating all applicants and employees fairly and equally.