Site Reliability Engineer
Remote - London, England, United Kingdom
We’re looking for a Site Reliability Engineer to help us scale our platform and serve our fast-growing global community of learners.
As part of the Scalable Platform team, you will work closely with our Software Engineers, Data Engineers, Technical Architects, Information Security Manager and Director of Technology to ensure that the FutureLearn platform is secure, robust, and scalable. You will also build and maintain internal tooling to help our developers test, deploy and debug their code with ease.
Our Technology Stack
Our platform is built on AWS and we manage our infrastructure as code using Terraform. We run everything in containers with ECS and use GitHub Actions for CI/CD, deploying multiple times per day.
The website itself is written in Ruby on Rails and React, and we use unit, integration and acceptance tests to drive design and keep everything working. We are in the process of splitting our code into separate backend and frontend apps, with a GraphQL API on the backend.
We use Datadog and Scout to monitor our application and infrastructure. We also have a very popular internal developer CLI tool written in Go.
How the Technology Team works
At FutureLearn we work in multidisciplinary product teams, collaborating with designers, engineers, product managers and stakeholders. We work in short sprints and regularly share, reflect on and iterate on our work. We care about work/life balance and supporting learning at work.
As a Site Reliability Engineer at FutureLearn, you will:
- Be responsible for maintaining the reliability of our platform infrastructure.
- Help set the technical direction for our infrastructure, ensuring that it continues to scale to support our growth in a cost-effective manner.
- Ensure we have sufficient logging, monitoring and alerting capabilities to know when the platform is experiencing abnormal performance and to be able to identify the underlying causes.
- Ensure our platform is robust enough to withstand spikes in demand and malicious attacks.
- Respond to incidents affecting the platform, including being part of the on-call rota.
- Support the Information Security Manager in setting up pen tests and other security exercises, and in remediating the issues that arise from them.
- Perform regular disaster recovery exercises to ensure we can recover in the case of an incident.
- Build a DevOps culture at FutureLearn.
- Improve the process of developing, testing and continuously deploying the FutureLearn application so that it’s safer, faster and easier for engineers to work on.
- Empower software engineers to understand how to get their code into production, and how to identify and debug performance issues.
- Support software engineers by pairing, teaching, mentoring, coaching, reviewing code and demonstrating the practices of an effective engineer.
Skills and experience we are looking for:
- Experience architecting and supporting cloud-native web application infrastructure.
- Experience working with containers and schedulers (we use Amazon ECS).
- Experience using automated config management systems to manage and version cloud instances (we use Terraform).
- A deep understanding of Linux, networking and security.
- Experience supporting database administration and performance, taking into account scalability and maintainability.
- A keen interest in automating processes and improving the developer experience.
- Experience working closely with software engineers in an agile environment.
- A good understanding of git and how to use version control to effectively structure and communicate your work.
Any of the following is a bonus:
- Experience managing relationships with suppliers (e.g. AWS, Cloudflare)
Above all, we are looking for people who are curious, think critically, are eager to learn and keen to use their experience to help and support others. You will need to be able to communicate and explain things clearly and work well in a collaborative environment.
Benefits
- 28 days holiday (plus 8 days public holiday)
- Buy & sell up to 5 days holiday
- Charity day (volunteer for a charity of your choice)
- Cycle to work scheme, and secure bike parking and showers in the office
- Season Ticket loan
- Flexible working environment/hours
- Pension (4% employer / employee contribution)
- OU Staff Fee Course Waiver Programme
- Great coffee, teas, fruit and daily breakfast
FutureLearn is a leading social learning platform, formed in December 2012 by The Open University and now jointly owned by The Open University and The SEEK Group. Over 10 million people have signed up worldwide. FutureLearn uses design, technology and partnerships to create enjoyable, credible and flexible short online courses and microcredentials, as well as undergraduate and postgraduate degrees, that improve working lives. It partners with over a quarter of the world’s top universities, as well as organisations such as Accenture, the British Council, CIPD, Raspberry Pi and Health Education England (HEE). It’s also involved in government-backed initiatives to address skills gaps, such as The Institute of Coding and the National Centre for Computing Education.
Please use our online form by pressing 'Apply for this job' below, including your CV and a cover letter telling us why you'd like to come work with us.
How we assess candidates
We use a set of competencies to evaluate candidates throughout the interview process: communication, initiative, teamwork, curiosity and technical skill. You can read more about these in our blog post about our hiring framework.
Please contact firstname.lastname@example.org if you require any reasonable adjustments or alterations to support you through the recruitment process.
We value all the great benefits that diversity brings and encourage everyone to bring their whole self to work, regardless of gender, religion, ethnicity, sexual orientation, age or disability.
We encourage people who have been made redundant as a result of COVID-19 to apply for opportunities at FutureLearn. We believe that in these difficult times, good employers have to rise to the occasion and play their part in the community. At FutureLearn, we take care of each other.