Site Reliability Engineer
Our Technical Operations and SRE team supports the growing needs of the company by working collaboratively with multiple teams. We partner with the Product Engineering team to architect, maintain, and improve the reliability and performance of ZenGRC for our growing list of both on-prem and hosted customers. We are responsible for investing in the technology, processes, and automation to handle the growth of the company and maintain the security and integrity of our customers’ data.
What you will get to do:
- Short term
  - Get up to speed with our software and systems architecture, including the fundamentals of AWS and Kubernetes
  - Maintain documentation (e.g., deployment procedures, disaster recovery, backup procedures, security settings)
  - Build and deploy monitoring infrastructure
  - Maintain and improve CI/CD infrastructure
  - Ensure system integrity, the safety and security of customer data, and the security of internal systems
  - Plan and execute configuration changes at both the application and the infrastructure level
- Long term
  - Propose, plan, and execute ideas and solutions to improve the overall effectiveness of the engineering organization
  - Partner with multiple engineering teams to:
    - Define Service Level Objectives and appropriate availability guarantees for software
    - Review architecture for new features and deployment requirements
    - Plan capacity, provisioning, and change management
    - Understand the application security model and plan for mitigating possible threats
What we are looking for:
- 5+ years of experience in a tech ops / SRE role
- Proven experience building, scaling, and operating software in AWS
- Proficiency with Linux systems and at least one programming language (e.g., Go, Python)
- A working understanding of Docker, AWS, and MySQL/Postgres
- Demonstrated ability to prioritize and work in a complex environment
- The ability to separate good from perfect with a bias to get things done
- Proficiency in configuration management tools (e.g., Ansible, Chef, Puppet)
- Experience with Terraform or other infrastructure-as-code tools
- A strong desire to continue learning
- Nice to have:
What we offer:
- Flexible working hours – we are remote-friendly.
- Compensation in USD.
- Stock options.
- A great and talented team to learn from and work with.
- The chance to work on software used by top Silicon Valley startups.
- Internal knowledge sharing sessions.