Software Engineer, Reliability (Agents of Webapp)
Denver, CO, USA (Remote Only)
About the Team
The Agents of Webapp is a Site Reliability Engineering (SRE) team managing Slack’s webserving infrastructure and dedicated to making Slack reliable. We continuously seek to improve the visibility, speed, and safety of Slack’s “webapp” runtime: components at the center of Slack’s distributed application architecture.
We are a growing and changing team, welcoming new perspectives and strategies to address evolving challenges to reliability. We collaborate with many engineering teams at Slack to continuously improve a shared runtime for the webapp: the infrastructure which enables continuous deployment of application code to meet the needs and expectations of millions of Slack users.
Slack has a positive, diverse, and supportive culture—we look for people who are curious, inventive, and work to be a little better every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, why not say hello?
What you will be doing
- You will directly support multiple components of Slack’s webserving infrastructure, including Apache, HHVM, Squid, Memcache, Docker, Kubernetes and AWS services.
- You will collaboratively help support additional software components at Slack that work in conjunction with the webapp. Examples include Consul, Envoy, HAProxy, Chef, Terraform, databases, and caching services.
- You will help develop new deployment mechanisms for our webapp infrastructure, such as canary, A/B, blue/green, red-line, and other deployment patterns.
- You will lead large engineering projects, from start to finish, where the scope is mostly understood.
- You will define SLAs and SLOs for the Slack webapp, manage code deployments, fixes, and software updates, and automate our operational processes.
- This team has an operational responsibility in addition to being a software development team. You will participate in the team’s on-call rotation, assist with triaging and addressing production issues, and respond to incidents at Slack that involve the webapp.
- You will review code and get your code reviewed; mentor and be mentored by other engineers. Teamwork is what makes the dream work.
What you should have
- Curiosity about how things work and a love of sharing that knowledge with others
- Experience managing critical production infrastructure, maintaining reliability and uptime, and having a “customer first” view of operational safety.
- A positive approach that embraces standard methodologies for software management and reliability, including unit testing, code review, design documentation, debugging, and troubleshooting.
- A passion for reliability, scaling patterns, uptime, and availability.
- A demonstrable history of thriving within a software development team, even if your roles have included traditional operations and/or infrastructure management duties.
- Professional experience with functional or imperative programming languages, e.g., PHP, Python, Ruby, Go, C, or Java (used without frameworks)
- Strong command of computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval
- Bachelor’s degree in Computer Science, Engineering or related field, or equivalent training or work experience
- Experience developing and managing modern public cloud infrastructure, especially AWS
Experience Bonus Points:
- as a Site Reliability Engineer (SRE), or as a platform or infrastructure engineer building and managing reliability mechanisms on distributed infrastructure
- deploying, operating and debugging software on Linux at scale
- managing full-stack infrastructure hands-on, i.e., networking, storage, virtualization and/or host hardware, configuration management, and packaging
- using deployment automation/configuration management, such as Chef, Puppet, Ansible or Salt
- with Incident Response programs and processes, including triaging and resolving production incidents at an organization with challenging SLAs and customer expectations
Slack is registered as an employer in many, but not all, states. If you are not located in or able to work from a state where Slack is registered, you will not be eligible for employment. Visa sponsorship may not be available in certain remote locations.
Visa sponsorship is not available for candidates living outside the country of this position.
The base pay range targeted for this role is $133,833 - $146,000. This base pay range is for illustrative purposes only. This position is eligible for additional compensation and benefits including: incentive compensation; health benefits; flexible spending account; retirement benefits; life insurance; commuter benefits; paid time off (including PTO, emergency time off, paid sick leave, medical leave, volunteer time off, civic duty leave, bereavement leave, floating holidays and paid holidays); parental leave and benefits; mobile phone and internet allowance; perks stipend; and other employee perks and benefits.
The actual offer, reflecting the total compensation package and benefits, will be at the company’s sole discretion, and determined by a myriad of factors including, but not limited to, years of experience, depth of experience, and other relevant business considerations. The company also reserves the right to amend or modify employee perks and benefits at any time.