Senior Site Reliability Engineer, Monitoring
Remote, United States
Wayfair is a leader in the e-commerce space for all things home. We live and breathe modern technologies. This role can be 100% remote.
We’re looking for a smart, driven, and passionate engineer to join the observability platform team. The observability platform at Wayfair is composed of complex distributed systems and data pipelines built mainly using Grafana, InfluxDB, Prometheus (TSDB), Elastic Stack (formerly ELK), Apache Kafka and Tremor (an in-house event processing system built initially for our logging needs and now open source!). We collect upwards of 10 billion log events and 17 billion metrics per day, generated by 20k+ systems and 500+ homegrown applications across multiple geo locales and GCP regions, while supporting queries against these datasets to provide proper visibility to our consumers (3k and growing).
On the Monitoring Platform team as a Senior Engineer, you’ll have plenty of opportunities to share your strengths and build up others while contributing to both mature and emerging open-source projects. You will work in a global team, with on-premises and cloud-based deployments, in an inclusive environment. If this sounds like fun to you, please continue reading and apply!
What You’ll Do
- Drive the design of various system components, infrastructure, and tools being written primarily in the Go programming language, keeping performance and scalability in mind
- Participate in code reviews, systems design and architectural sessions to ensure that our platform and supporting services are developed/deployed using best practices.
- Interface with business product leaders and engineers to gather requirements on various projects and translate requirements into system design as the platform sees more use throughout the company
- Contribute to and maintain our existing documentation platform, used for onboarding new engineers and providing self-service to our consumers.
- Build and grow our team by mentoring junior engineers and leading by example to implement industry standards and best practices in software engineering and infrastructure.
- Influence the long-term roadmap for the observability platform team and contribute your ideas directly to the stack.
- Test the limits of the various open-source components we use, identify opportunities to improve them, and implement those improvements as needed and when feasible.
What You’ll Need
- 4+ years of experience in systems and software engineering, as well as SRE/DevOps paradigms
- Experience writing production-ready, well-crafted applications and services using Golang
- Experience in scripting languages used in the infrastructure space (Python, Ruby, Bash etc.) as well as familiarity with version control systems such as Git.
- 2+ years of hands-on experience with distributed systems like Elastic Stack (ELK Stack), Kafka, NoSQL and TSDBs.
- 2+ years of working with configuration management and orchestration tools such as Puppet, Chef, Ansible and Terraform.
- Experience growing a team by mentoring junior engineers and helping develop their skills while assisting them on projects
- Efficient at prioritizing different tasks based on their relative importance in a fast-paced production environment
About Wayfair Inc.
Wayfair is one of the world’s largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we’re reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you’re looking for rapid growth, constant learning, and dynamic challenges, then you’ll find that amazing career opportunities are knocking.
No matter who you are, Wayfair is a place you can call home. We’re a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair – and world – for all. Every voice, every perspective matters. That’s why we’re proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.