Sr. Site Reliability Engineer

Hawthorne, CA, United States

Full Time Senior-level / Expert
SpaceX
SpaceX designs, manufactures and launches advanced rockets and spacecraft. The company was founded in 2002 to revolutionize space technology, with the ultimate goal of enabling people to live on other planets.

SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.

SR. SITE RELIABILITY ENGINEER, GROUND SEGMENT

The SpaceX Ground Segment team is responsible for supporting a wide array of communications needs. They support and deploy our worldwide network of ground stations, own all launch site and recovery RF communications and optical tracking, and facilitate all Starlink Telemetry, Tracking, and Control (TT&C). For all real-time operations of Falcon, Dragon, and Starship, the team facilitates NASA flight operations interfaces and network integration, as well as operational voice, mission control video, and communications readiness testing.

The successful candidate will lead the design and operation of our ground station network, operational systems, and partner interfaces (such as NASA) with an unrelenting drive toward reliability and automation. This network is critical to operations throughout all phases of a mission, providing command, telemetry, video, and voice communications. This role will be critical as the team develops the infrastructure to support lunar and Mars missions.

The ideal candidate will be flexible, possess broad skills across hardware installation, networking, monitoring, and software deployment/development, and flourish in a fast-paced and challenging environment. You should be a self-starter who is able to take initiative and approach traditional problems in novel ways.

RESPONSIBILITIES:

  • Deploy, upgrade, operate, maintain, and scale our suite of mission critical products and services
  • Own and improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
  • Practice sustainable incident response and blameless postmortems
  • Manage the underlying infrastructure in collaboration with IT and InfoSec
  • Drive scripting and automation to develop solutions to common problems
  • Design and automate test cases, write test plans, and develop supporting test tools
  • Own periodic regression tests to ensure production systems maintain their performance levels
  • Install and configure servers, network equipment, and other IT datacenter systems

BASIC QUALIFICATIONS:

  • 8+ years of site reliability, system administration, or DevOps experience in a Linux environment
  • Experience with Puppet, Ansible, or other automation frameworks
  • Scripting/programming experience in shell (Bash), Python, and/or other languages
  • Networking knowledge of TCP/IP
  • Experience with source code and version control tools such as Subversion or Git

PREFERRED SKILLS AND EXPERIENCE:

  • Bachelor's degree in computer science, information systems/IT or engineering
  • 8+ years of systems administration, site reliability engineering, or DevOps experience
  • Experience managing dozens to hundreds of servers
  • Knowledge of software engineering practices including continuous integration, configuration management, build optimization, build automation, and deployment
  • Ability to address and resolve information technology issues promptly, effectively, and independently
  • Experience with alerting and monitoring systems such as Nagios, Icinga, Zabbix, Opsgenie, and Grafana
  • Experience with workflow and issue management tools such as Jira
  • Comfortable working with mission critical and sensitive systems with a sense of urgency appropriate to the responsibilities

ADDITIONAL REQUIREMENTS:

  • Able to work extended hours and weekends as needed
  • Willing to travel to other SpaceX offices and global ground station locations as needed

ITAR REQUIREMENTS:

  • To conform to U.S. Government space technology export regulations, including the International Traffic in Arms Regulations (ITAR), you must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.

SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.

Applicants wishing to view a copy of SpaceX’s Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application/interview process should notify the Human Resources Department at (310) 363-6000.

Job perks/benefits: Flex hours
Job region(s): North America