Site Reliability Engineering, Sustaining Engineer - Wayfair Operations Center

Boston, MA

Wayfair Inc.

Posted 2 weeks ago

Wayfair is the online leader for all things home. Through technology and innovation, Wayfair makes it possible for shoppers to quickly and easily furnish the home of their dreams from a selection of millions of items. Our platform is technology-driven, leveraging best-in-class solutions with a constant drive for innovation.

Who We Are

The Sustaining Engineering and Operations team is responsible for streamlining, automating, and enabling cloud-scale growth across Wayfair engineering in a technically sustainable manner. Our team has a presence in both Boston and Berlin, supporting our core regions of operation. The team is looking to add a strong engineer to help us build a better way of doing things. We’re looking for someone who is as excited as we are about building that next-generation environment, one that is automation-heavy and as dynamic as possible. This role offers exposure to our full technology stack and an opportunity to drive tangible improvements to a world-class eCommerce platform.

What You’ll Do

  • Respond to and investigate technical issues of all sizes and types
  • Automate or streamline manual tasks and redundancies within the infrastructure organization (and never solve the same problem twice!)
  • Improve monitoring and alerting through a proactive approach to catch and fix infrastructure, code, and database issues before they cause impact
  • Drive efficiency and proactive monitoring & alerting across all of Wayfair’s platforms
  • Provide a level of support and visibility across our full stack, learning how our different systems interrelate and impact each other
  • Write and refine our troubleshooting guides, documentation, and best practices

What You’ll Need

  • 3+ years of experience in a NOC, SRE, Platform, or Operational Engineering role
  • Strong knowledge of Linux fundamentals, specifically OS configuration and tuning
  • Experience with Python scripting or Puppet / Terraform / Ansible automation
  • Experience with Datadog or similar monitoring or tracing tools

Nice To Have

  • Previous experience in an eCommerce or services-based company
  • Experience managing code deploys (Git / Buildkite)
  • Experience with Elastic’s ELK stack or similar logging platforms
  • Experience coding in any of Python, PHP, Java, Go, or similar

About Us

Wayfair is one of the world’s largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we’re reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you’re looking for rapid growth, constant learning, and dynamic challenges, then you’ll find that amazing career opportunities are knocking.

No matter who you are, Wayfair is a place you can call home. We’re a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair – and world – for all. Every voice, every perspective matters.

That’s why we’re proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.

Job tags: Ansible ELK Git Go Java Linux PHP Puppet Python Reliability engineering Terraform