Staff Site Reliability Engineer
Redwood City, CA
We’re looking for an experienced Site Reliability Engineer to fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component to operating large-scale systems.
- Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our Internet-facing services.
- Gain deep knowledge of our complex applications.
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment.
- Participate in a 24x7 on-call rotation.
- Familiarize yourself with the Poshmark tech stack and functional requirements.
- Get comfortable with the automation tools and frameworks used within the CloudOps organization and with the associated deployment processes.
- Gain in-depth knowledge of the relevant product functionality and the infrastructure it requires.
- Start contributing by working on small to medium-scale projects.
- Understand the on-call rotation and join it as a secondary to become familiar with the on-call process.
12+ Month Accomplishments
- Execute projects related to comms functionality independently, with little guidance from the lead.
- Create meaningful alerts and dashboards for the various subsystems involved in the targeted infrastructure.
- Identify gaps in the infrastructure and suggest or implement improvements.
- Participate in the on-call rotation.
- 5+ years of experience in a Systems Engineering or Site Reliability Operations role is required, ideally in a startup or fast-growing company.
- 5+ years in a UNIX-based, large-scale web operations role.
- 5+ years of experience providing 24/7 support for large-scale production environments.
- Battle-proven, real-life experience in running a large-scale production operation.
- Experience working on cloud-based infrastructure (e.g., AWS, GCP, Azure).
- Hands-on experience with continuous integration tools such as Jenkins, configuration management with Ansible, and systems monitoring and alerting with tools such as Nagios, New Relic, and Graphite.
- Experience with scripting or coding.
- Ability to use a wide variety of open source technologies and tools.
Technologies we use:
- MongoDB, RabbitMQ, Redis, ElasticSearch.
- Amazon Web Services (EC2, RDS, CloudFront, S3, etc.)
- Terraform, Packer, Jenkins, Datadog, Kubernetes, Docker, Ansible and other DevOps tools.
Poshmark is a leading social marketplace for new and secondhand style for women, men, kids, pets, home, and more. By combining the human connection of physical shopping with the scale, ease, and selection benefits of ecommerce, Poshmark makes buying and selling simple, social, and sustainable. Its community of more than 80 million registered users across the U.S., Canada, Australia, and India is driving the future of commerce while promoting more sustainable consumption. For more information, please visit www.poshmark.com, and for company news and announcements, please visit investors.poshmark.com. You can also find Poshmark on Instagram, Facebook, Twitter, Pinterest, and YouTube.
At Poshmark, we’re constantly challenging the status quo and are looking for innovative and passionate people to help shape the future of Poshmark. We’re disrupting the industry by combining social connections with e-commerce through data-driven solutions and the latest technology to optimize our platform. We’re nothing without our amazing team who deliver an unparalleled social shopping experience to the millions of people we connect each day.
We built Poshmark around four core values: 1) focus on people to create empowered communities that drive success; 2) together we grow to support each other to strive for our dreams; 3) lead with love to foster genuine connections built upon a foundation of respect; and 4) embrace your weirdness to accept and empower one another on their own unique journey. We’re invested in our team and community, working together to build an entirely new way to shop. That way, when we win, we all win together. Come help us build the most connected shopping experience ever.
Here’s what we’ll set you up with:
- A team that is invested in your career growth and training
- Competitive salary and equity, based on experience
- Company sponsorship of up to 100% of the cost of your health, dental, and vision plans, and up to 90% for your dependents
- Work alongside world-class talent
- Flexible vacation / paid time off policy
- Parental leave
- Personal style encouraged (or not, whatever you’re into)
Poshmark is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.