Site Reliability Engineer

Hong Kong

The Company

BitMEX is the world’s leading cryptocurrency derivatives trading platform, which has pioneered cryptocurrency trading through relentless commitment to change, and continues to set benchmarks for innovation, liquidity, and security today.

As the world's most advanced peer-to-peer crypto-products trading platform and API, BitMEX gives knowledge, confidence, and precision to hundreds of thousands of traders, transacting billions of USD a day.

Join us, as we build a thriving cryptocurrency ecosystem through strategic investments in emerging cryptocurrency technology, and create the future of digital financial services.

Purpose of This Role

This role is responsible for keeping all application services and production systems running smoothly at all times, and developing disaster recovery capabilities by designing and implementing fault tolerant architectures and procedures that meet our RPO and RTO requirements.

This role will work closely with ETT development teams to ensure our infrastructure is monitored, so that we become aware of potential infrastructure faults early and can react before our customers are impacted, preventing or minimising downtime.

Key Responsibilities

  • Own the implementation of our RPO/RTO requirements.
  • Work with development teams to ensure our applications are monitored in a consistent manner.
  • Own post-incident reviews and ensure any action items are prioritised and fully remediated.

Required Skills & Competencies

  • 6+ years of professional experience, with a proven track record of designing, implementing, managing, and testing infrastructure at scale on AWS.
  • Have experience designing, planning and building out data centers to match core software requirements as part of cross-provider and cross-continental deployments for disaster recovery purposes.
  • Have good experience with low-latency, high-throughput and highly available networks spanning regions.
  • Have experience with Chef and Terraform.
  • Have a detail-oriented mindset that considers edge cases, failure modes and behavioral patterns above all.
  • Strong engineering skill set with a firm grasp of fundamental Computer Science principles and a modular, maintainable, agile & test-driven approach to software development.
  • Strong technical troubleshooting, diagnostic and problem-solving skills, with the capacity to multitask and give equal attention to a variety of functions while under pressure.
  • Ability to work independently and comfortably to tight schedules.
Job tags: AWS Chef React Terraform
Job region(s): Asia/Pacific