Senior Site Reliability Engineer (SRE)
Kuala Lumpur, Kuala Lumpur, Malaysia
We are looking for top-notch Senior Site Reliability Engineers or Senior DevOps Engineers to contribute to the most advanced tech projects at Mindvalley and to streamline, scale, and optimize our products and platforms.
Mindvalley is one of the leading and most promising ed-tech companies on the planet. We have dominated the US market for Personal Growth Education and created a brand that is now powering athletes on every major US sports team and learning in major companies. But we're more than that. We're currently working on the most advanced learning system on the planet: a version of Iron Man's Jarvis that uses AI and augmented reality to provide customized learning that can turn anyone into a superhero. We make people better humans in every aspect of life, and we are seeking the best engineers on the planet to come together and build the most advanced education platform our species has seen. If we achieve our goal, we will be powering 100 countries and every company in the Fortune 500, and moving humanity towards a better future.
As a Senior Site Reliability Engineer (SRE) / DevOps Engineer, you can expect to grow with an international team and work with state-of-the-art tools and techniques. You will have the opportunity to work directly with our CEO and Tech leadership to bring to life the most impactful projects and ideas.
You must have a solution-oriented mindset, always look for the best solutions to problems, and be even more productive in a collaborative team environment. In this role it's essential to execute and communicate fast, as you’ll work on high-priority initiatives.
What makes Mindvalley one of the leading and most promising ed-tech companies?
- Collaborate with our business teams to build and run sustainable production systems that can evolve and adapt to changes in our fast-paced, global business environment
- Work directly with customers, developers, architects, and product owners to help automate the resolution of customer issues and drive stronger, more reliable solutions
- Engage in and improve our digital product solutions across design, deployment, operation, and continuous improvement
- Participate in system design consulting, platform management, and capacity planning
- Ensure digital products are up and running by measuring and monitoring availability, performance, and overall system health
- Engage with key vendors in assessing technology fit with Mindvalley’s future technology architecture and provide recommendations
- Complete pre-production validation activities such as system design consulting, developing software platforms and frameworks, capacity planning and Production Readiness reviews
- Balance feature development velocity and reliability with well-defined SLOs
- Run the production environment by monitoring availability and taking a holistic view of system health
- Drive the incident management process and support a blameless post-mortem culture
- Partner with development teams to improve services via rigorous testing and release procedures
- Create sustainable systems and services through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
Requirements:
- Bachelor’s degree in computer science or related technical field involving coding (e.g., physics or mathematics)
- 5+ years of work experience in distributed systems design, maintenance, and troubleshooting
- Several years of experience working in a DevOps environment or in software engineering in fast-moving technical environments, preferably start-ups / scale-ups
- Proven understanding of the full Software Development Life Cycle (Waterfall and Agile)
- Demonstrable involvement with continuous improvement and automation initiatives
- Familiarity with infrastructure migration, virtualization, performance analysis, log storage systems, and new functionality enablement
- Exposure to design/implementation of infrastructure, configuration, build, installation and running
- Experience working with one or more cloud providers (AWS, GCP, Azure); exposure to multi-cloud is a bonus
- Proven experience with key container orchestration technologies and containerization principles
- Experience defining and implementing CI/CD processes and tools (e.g., Kubernetes Operators, Helm, Spinnaker, Weave, and Terraform)
- Knowledge of production databases such as MySQL, PostgreSQL, MongoDB, Neo4j, and Aerospike, and the impact these have on application design
- Exposure to Agile practices (Scrum Master certification is a nice-to-have)
- Demonstrated ability to make decisions in a fast-paced environment
On the personal side:
- You are excellent in both teamwork and independent delivery
- You have strong attention to detail
- You work well under pressure developing key features for high-volume, business-critical systems
- You are available to start remotely within 1-2 months