Senior Site Reliability Engineer, Monitoring

Remote, United States

Full Time Senior-level / Expert
Wayfair Inc.

Wayfair is a leader in the e-commerce space for all things home. We live and breathe modern technologies. This role can be 100% remote.

We’re looking for a smart, driven and passionate engineer to be part of the observability platform team. The observability platform at Wayfair is composed of complex distributed systems and data pipelines built mainly using Grafana, InfluxDB, Prometheus (TSDB), Elastic Stack (formerly ELK), Apache Kafka and Tremor (an in-house event processing system built initially for our logging needs and now open-source!). We collect upwards of 10 billion log events and 17 billion metrics per day, generated by 20k+ systems and 500+ homegrown applications across multiple geo locales and GCP regions, while supporting queries against these datasets to provide proper visibility to our consumers (3k and growing).

On the Monitoring Platform team as a Senior Engineer, you’ll have plenty of opportunities to share your strengths as well as build others while contributing to various mature as well as emerging open-source projects. You will work in a global team, with on-premises and cloud-based deployments, in an inclusive environment. If this sounds like fun to you, please continue reading and apply!

What You’ll Do

  • Drive the design of various system components, infrastructure, and tools being written primarily in the Go programming language, keeping performance and scalability in mind
  • Participate in code reviews, systems design and architectural sessions to ensure that our platform and supporting services are developed/deployed using best practices. 
  • Interface with business product leaders and engineers to gather requirements on various projects and translate requirements into system design as the platform sees more use throughout the company
  • Contribute to and maintain our existing documentation platform, used for onboarding new engineers and providing self-service to our consumers.
  • Build and grow our team by mentoring junior engineers and leading by example to implement industry standards and best practices in software engineering and infrastructure.
  • Influence the long-term roadmap of what the observability platform team looks like and contribute your ideas directly to the stack.
  • Test the limits of various open-source components we use, identify opportunities to improve them, and work on implementing the identified improvements as needed and when feasible.


What You’ll Need

  • 4+ years of experience in systems and software engineering, as well as SRE/DevOps paradigms
  • Experience writing production-ready, well-crafted applications and services using Golang
  • Experience in scripting languages used in the infrastructure space (Python, Ruby, Bash, etc.) as well as familiarity with version control systems such as Git
  • 2+ years of hands-on experience with distributed systems like Elastic Stack (ELK Stack), Kafka, NoSQL and TSDBs.
  • 2+ years of working with configuration management and orchestration tools such as Puppet, Chef, Ansible and Terraform.
  • Experience growing a team by mentoring junior engineers and helping develop their skills while assisting them on projects
  • Efficient at prioritizing different tasks based on their relative importance in a fast-paced production environment

About Wayfair Inc.

Wayfair is one of the world’s largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we’re reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you’re looking for rapid growth, constant learning, and dynamic challenges, then you’ll find that amazing career opportunities are knocking.

No matter who you are, Wayfair is a place you can call home. We’re a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair – and world – for all. Every voice, every perspective matters. That’s why we’re proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.

Job region(s): Remote/Anywhere North America