Site Reliability Engineer
BitMEX is the world’s leading cryptocurrency derivatives trading platform. It has pioneered cryptocurrency trading through a relentless commitment to change, and it continues to set benchmarks for innovation, liquidity, and security today.
As the world's most advanced peer-to-peer crypto-products trading platform and API, BitMEX gives knowledge, confidence, and precision to hundreds of thousands of traders, transacting billions of USD a day.
Join us, as we build a thriving cryptocurrency ecosystem through strategic investments in emerging cryptocurrency technology, and create the future of digital financial services.
Purpose of This Role
This role is responsible for keeping all application services and production systems running smoothly at all times, and for developing disaster recovery capabilities by designing and implementing fault-tolerant architectures and procedures that meet our recovery point objective (RPO) and recovery time objective (RTO) requirements.
This role will work closely with ETT development teams to ensure our infrastructure is monitored, so that we become aware of potential infrastructure faults early and can react before our customers are impacted, preventing or minimising downtime.
- Own the implementation of our RPO/RTO requirements.
- Work with development teams to ensure our applications are monitored in a consistent manner.
- Own post-incident reviews and ensure all action items are prioritised and fully remediated.
Required Skills & Competencies
- 6+ years of professional experience, with a proven track record of designing, implementing, managing, and testing infrastructure at scale on AWS.
- Have experience designing, planning, and building out data centers to match core software requirements, as part of cross-provider and cross-continental deployments for disaster recovery purposes.
- Have good experience with low-latency, high-throughput, and highly available networks spanning regions.
- Have experience with Chef and Terraform.
- Have a detail-oriented mindset that considers edge cases, failure modes, and behavioral patterns before anything else.
- Strong engineering skill set with a firm grasp of fundamental Computer Science principles and a modular, maintainable, agile & test-driven approach to software development.
- Strong technical troubleshooting, diagnostic, and problem-solving skills, with the capacity to multitask and give equal attention to a variety of functions while under pressure.
- Ability to work independently and comfortably under tight schedules.