Site Reliability Engineer (SRE) - Cloud Distributed Architecture
Barcelona, Catalonia, Spain
Hi, we're UserZoom - nice to meet you! If you’ve never heard of us before, we help companies get the user experience (UX) insights they need to deliver great digital experiences at scale through our all-in-one software platform (both web and mobile) and professional services teams.
We believe that every company will soon be a digital experience company, and we want to help make those digital experiences better. We do this by providing UX insights to some of the biggest brands in the world so they, in turn, can improve the experience they give their customers.
Check out more info here: www.userzoom.com
As part of the Engineering Team, you will have a great opportunity to contribute to our SaaS platform. Working alongside our global team (our UZ family can be found in the US, UK, Spain, and Poland), you will be responsible for creating something truly amazing - the UX industry is an exciting place to be right now. As UserZoom grows, so does our focus on your career and personal development. More importantly, the team here at UserZoom is like a family - we're both supportive and welcoming of new team members.
What you will be doing
- Analyzing and designing reliable and scalable engineering solutions on a daily basis.
- Curating the Production environment by monitoring availability and taking a holistic view of system health.
- Scaling systems through automation, improving velocity and reliability.
- Driving the incident management process and supporting a blameless postmortem culture.
- Being part of the Development team and working in close collaboration with the Operations team to improve services via rigorous testing and release procedures.
- Identifying architectural bottlenecks in applications as they are observed, and helping to define and implement improvements.
- Troubleshooting, evaluating, and resolving operational challenges, and contributing to defined SLOs.
- Managing availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
- Creating sustainable systems and services through automation and uplifts.
- Advocating for SRE and DevOps best practices in a cloud microservices architecture.
A few things we require for this position
- Strong sense of ownership, customer service, and integrity demonstrated through clear communication.
- Understanding of full-stack observability, instrumentation, telemetry, and resilience patterns.
- Expertise with software development lifecycles based on continuous deployment to the cloud.
- Solid understanding of the Linux Operating System, including Kernel, Memory, Process, Threads, Static / Shared Libraries, IPC, Signals.
- Understanding of networking and protocols such as HTTP, HTTP/2, DNS, ECMP, TCP/IP, ICMP, the OSI Model, and Load Balancing strategies.
- Familiarity with distributed systems, including Microservices, Service Mesh, Serverless, and Event-Driven architectures.
- Working knowledge and understanding of document-oriented databases, in particular MongoDB.
- Familiarity with the AWS toolset.
- Proficiency in Python and Bash; Node.js is a valuable bonus.
- Experience with Kubernetes, Docker, Envoy, Istio, Ambassador and/or Kafka.
- Passion for eliminating repetitive manual processes using automation.
- Fluent English.
What we offer
- Career plan
- Opportunity to grow professionally in a challenging environment
- International working environment