Site Reliability Engineer
BitMEX is the world’s leading cryptocurrency derivatives trading platform, which has pioneered cryptocurrency trading through relentless commitment to change, and continues to set benchmarks for innovation, liquidity, and security today.
As the world's most advanced peer-to-peer crypto-products trading platform and API, BitMEX gives knowledge, confidence, and precision to hundreds of thousands of traders, transacting billions of USD a day.
BitMEX explores, incubates, and pursues opportunities and investments, as part of its mission to reshape the modern digital financial system into one which is inclusive and empowering. BitMEX is a pioneer in the industry whose trading platform handles tens of thousands of low latency transactions per second, representing several billions of dollars traded every day.
The BitMEX BI team is responsible for the reliability and scalability of all the services that power the BitMEX exchange, and for providing solutions to our application and business teams. As a Site Reliability Engineer, you will focus on helping app teams adopt and evolve industry-standard tooling, and on providing solutions in a timely manner. Your day-to-day role will involve understanding how the applications work and how you can help evolve the stack to perform optimally.
- Work with application owners to ensure full visibility of the application stack.
- Improve observability instrumentation of the applications along the critical trading flow and its periphery, enabling end-to-end request tracing and prediction of future issues.
- Collaborate with the Product Engineering, Trading Technology and Application Support teams to develop dashboards and integrations (e.g. logs / time-series cross-referencing) that allow quick identification of reliability/performance problems and guided drill-down to accelerate incident management.
- Provide expert guidance and generate solutions for new business requirements. Utilising your knowledge, work to create a next generation ….
- Develop disaster recovery capabilities to ensure our business can continue to operate in the event of a technology failure
- 5 years of relevant experience, with at least 3 years supporting production-critical time-series databases (e.g. Influx, Prometheus, Graphite)
- 2 years cloud native experience (e.g. Kubernetes)
- Familiarity with or knowledge of Terraform (or similar product)
- Strong AWS, Linux/UNIX knowledge
- Experience working with offshore support teams
- Strong collaboration, analytical, verbal and written communication skills
- Utilizes sound decision-making skills and communicates well with other team members and business users. Identifies problems and recommends solutions.
- Works in a team environment, including cross-functional teams and teams with business users throughout the company. Interacts with all levels of management and staff across the organization
- You are comfortable context-switching across a wide variety of platforms and technologies and are able to find ways to glue different technologies together
- You are comfortable managing a complex, polyglot, and global infrastructure as code, and you understand how to fully automate its management from a centralized git repository.
Join us, as we build a thriving cryptocurrency ecosystem through strategic investments in emerging cryptocurrency technology, and create the future of digital financial services.