Site Reliability Engineering

United States - Remote

As a cloud computing services pioneer, we deliver proven multicloud solutions across your apps, data, and security. Maximize the benefits of modern cloud.

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.
SRE ensures that Rackspace's managed service offerings and customer deployments have reliability and uptime appropriate to users' needs and a fast rate of improvement, while monitoring and validating capacity and performance.
The role focuses on reliability, scalability, and the development of automation to manage repetitive tasks at scale.
Work is carried out with detailed instruction and/or supervision under the guidance of senior Developers and SREs.

You Will:

  • Support high-complexity deployments and internal teams on an as-needed basis, and collaborate with other teams on tools for systems automation
  • Work in conjunction with multiple teams to ensure uptime and reliability of customer deployments

You Have:

  • 2+ years of information systems design/architecture/development experience
  • Experience in one or more of C, C++, Java, Perl, Python, Bash or Go
  • Intermediate experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, client-server protocols (e.g., TCP/IP, UDP, ICMP), MAC addresses, IP packets, DNS, SDN, OSI layers, and load balancing
  • Expertise in designing, analysing and troubleshooting large-scale distributed systems
  • Intermediate knowledge of operating systems
  • Familiarity with algorithms, data structures and complexity analysis
  • Intermediate experience designing complex SaaS applications for cloud reliability and scalability
  • Strong experience with GCP, AWS or OpenStack APIs
  • Intermediate experience with cloud infrastructure automation and CI/CD pipeline design
  • Expertise in operational monitoring and management tools (Nagios, Datadog, etc.)
  • Intermediate written & verbal communication skills, both highly technical and non-technical
  • Ability to work closely with non-technical stakeholders and executives
  • Systematic problem-solving approach, coupled with a strong sense of ownership and drive
  • Additional skills may be required depending on the role; for example, Kubernetes, Docker, Terraform, Ceph and other modern tools/technologies


  • High school diploma or equivalent required
  • Bachelor's degree in Computer Science or equivalent experience preferred

About Rackspace Technology

We are the multicloud solutions experts. We combine our expertise with the world’s leading technologies, across applications, data and security, to deliver end-to-end solutions. We have a proven record of advising customers based on their business challenges, designing solutions that scale, building and managing those solutions, and optimizing returns into the future. Named a best place to work, year after year according to Fortune, Forbes and Glassdoor, we attract and develop world-class talent. Join us on our mission to embrace technology, empower customers and deliver the future.

More on Rackspace Technology

Though we’re all different, Rackers thrive through our connection to a central goal: to be a valued member of a winning team on an inspiring mission. We bring our whole selves to work every day. And we embrace the notion that unique perspectives fuel innovation and enable us to best serve our customers and communities around the globe. We welcome you to apply today and want you to know that we are committed to offering equal employment opportunity without regard to age, color, disability, gender reassignment or identity or expression, genetic information, marital or civil partner status, pregnancy or maternity status, military or veteran status, nationality, ethnic or national origin, race, religion or belief, sexual orientation, or any legally protected characteristic. If you have a disability or special need that requires accommodation, please let us know.

Position is available for remote work in the following states unless otherwise specified.
Alabama, Arizona, Arkansas, California, Connecticut, Delaware, District of Columbia, Florida, Georgia, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, Wyoming.
Job region(s): Remote/Anywhere North America