Senior Site Reliability Engineer, Observability

Boston, MA

Full Time Senior-level / Expert
Wayfair Inc.

Posted 3 weeks ago

Wayfair is a leader in the e-commerce space for all things home. We live and breathe modern technologies. We are a “move fast, break things, rethink old standards” team with a startup feel, working with platforms at massive scale. This role can be 100% remote in the US.

We’re looking for a smart, driven and passionate engineer to join the observability platform team. The observability platform at Wayfair is composed of complex distributed systems and data pipelines built mainly using Grafana, InfluxDB, Elastic Stack (formerly ELK), Apache Kafka and Tremor (an in-house event processing system built initially for our logging needs and now open source!). We collect 10+ billion log events per day and 17+ billion metrics generated by 20,000+ systems and 500+ homegrown applications across multiple geo locales and GCP regions. We also support searches against these datasets for engineering functions like monitoring, alerting, observability, high-velocity software development and security incident event management.

As a Senior Engineer on the Observability Platform team, you’ll have plenty of opportunities to share your strengths as well as build others while contributing to various mature and emerging open-source projects. You will work in a global team, with on-premises and cloud-based deployments, in an inclusive environment. If this sounds like fun to you, please continue reading and apply!

What You’ll Do

  • Drive the design of various system components, infrastructure, and tools written primarily in the Go programming language, keeping performance and scalability in mind
  • Work with other software developers and operators to ensure that the system is developed and deployed using proper analysis, design, development and testing methodologies
  • Interface with business product leaders and engineers to gather requirements on various projects and translate requirements into system design as the platform sees more use throughout the company
  • Create and maintain detailed documentation for both self-service and onboarding
  • Help build and grow our team by mentoring junior engineers, nurturing and developing their skills while assisting them on a variety of projects
  • Help determine the future roadmap of the logging team and ways we can improve the platform
  • Test the limits of various open-source components we use, identify opportunities to improve them and work on the implementation of the identified improvements as needed/when feasible

What You’ll Need

  • 4+ years of experience in systems and software engineering, as well as SRE/DevOps paradigms
  • Experience writing production-ready, well-crafted applications and services using Golang
  • Experience in scripting languages used in the infrastructure space (Python, Bash etc.) as well as familiarity with version control systems such as Git
  • 2+ years of hands-on experience with distributed technologies like Elastic Stack (ELK Stack), Kafka and time series databases
  • 2+ years of working with configuration management and orchestration tools such as Puppet, Ansible and Terraform
  • Experience building up a team by mentoring junior engineers and helping develop their skills while assisting them on projects
  • Efficient at prioritizing different tasks based on their relative importance in a fast-paced production environment

About Wayfair Inc.

Wayfair is one of the world’s largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we’re reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you’re looking for rapid growth, constant learning, and dynamic challenges, then you’ll find that amazing career opportunities are knocking.

No matter who you are, Wayfair is a place you can call home. We’re a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair – and world – for all. Every voice, every perspective matters. That’s why we’re proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.

Job tags: Ansible Apache Bash ELK GCP Git Go Golang Grafana Kafka Puppet Python Terraform
Job region(s): North America