Principal Site Reliability Engineer
We are a fast-growing and pioneering people analytics company that is transforming the financial workplace. We use cutting-edge software and machine learning to generate previously unidentifiable insights into employee behavior and performance. We have been recognized by renowned companies such as Amazon Web Services and Google Cloud for our achievements in AI, big data analytics, and machine learning. We have also been included in the Forbes FinTech 50, CB Insights AI 100, and Tech Nation’s prestigious Future 50 program.
Our goal is to help businesses achieve better outcomes by developing and delivering data-driven solutions for compliance, CRM, HR, and workplace productivity. We also aim to rapidly expand our worldwide customer base to include companies across all major industries.
About the Role
The Behavox Platform is a scalable, fault-tolerant, and highly performant storage and processing system that allows us to manage and analyze massive volumes of data. We have an extensive and flexible set of APIs for developing products that let our clients work through millions of data items by searching, filtering, and visualizing relationships between entities in the system. For our most demanding users (including ourselves), we have built our own IDE!
As a Principal SRE, you will be responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of all production systems and services. You will work with other ProdOps, Product, and Engineering leaders to design and implement SRE practices at Behavox and to build the foundational infrastructure needed to support the rapid growth of the Behavox client base.
This is an incredible opportunity to discover the world of high-load data processing and face the challenges of distributed Big Data systems. It will also give you the opportunity to:
1. Work on critical business areas that will have a big impact on the company
2. Implement your ideas in an environment that is looking to constantly improve
3. Be part of a fast-growing, dynamic company and work with modern technologies
What You'll Bring
- 5+ years of experience as an SRE/DevOps engineer, with expert knowledge of SRE practices (designing and improving application observability, error budgets, service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs))
- Automation skills - Ansible/SaltStack or equivalent tools
- Knowledge of programming languages (Python / Go / Java)
- Linux system administration expertise, solid knowledge of Linux OS fundamentals
- Experience with public clouds (AWS / GCP / Azure)
What You'll Do
- Design and implement SRE practices that ensure the availability, scalability, and observability of production systems, with a strong focus on excellent customer experience
- Take responsibility for SaaS deployments and frequent releases of complex, large-scale distributed software
- Create tooling to automate the operations (Python/Golang, Ansible/Salt)
- Maintain production clusters and provide L3 support for SaaS customers: software and infrastructure updates, troubleshooting, patch and vulnerability management
- Manage infrastructure comprising a large number of Linux servers
What We Offer
- A truly global mission with a passionate community in locations all over the world
- Huge impact and learning potential as our aspirations require bold innovation
- Highly competitive compensation with 100% bonus pay already integrated
- Benefits include full health coverage for employees and their families
- Generous time-off policy and flexible work schedule
About Our Process
We take Talent very seriously, and we are building a community of extraordinary individuals working together in very high-performing teams. We also know that the best Talent always has options, so we believe the process has to be a two-way assessment, with both the company and the candidate assessing alignment on business needs, career next steps, and culture.
We will begin the process by exploring the core factors of salary and location, along with core experience, skills, and values alignment. We will then explore in depth the critical technical competencies we have identified for the role, followed by a deep dive into behavioral competencies.
If you are the most closely aligned candidate, you will then be asked to complete a practical work simulation so that we can make sure you will enjoy the kind of work the role requires. This task is typically presented to and discussed with a group of colleagues and managers.