Senior Systems Reliability Engineer

Remote, United States

Full Time, Senior-level / Expert
Netflix

Posted 1 week ago

The Creative Compute and Storage team designs, develops and delivers technology infrastructure globally for the evolving needs of our creatives. As we continue to expand our content creation globally, we are looking for the best and brightest engineering talent to be part of our growth. Our team is looking for a Senior Systems Reliability Engineer to aid in the development of our purpose-built infrastructure platforms, with a focus on operability and high availability. You will work with internal engineering teams and external vendors around the world to deliver amazing technology experiences for our creative users. We are looking for an experienced engineer who brings a broad set of technical skills and a software/automation-focused mindset to solving complex distributed systems problems.
Be sure to review our culture page and long-term view to learn more about the unique Netflix culture and the opportunity to be part of our team.

As a member of the team you will...

  • Drive continual improvement in monitoring, configuration management, instrumentation and automation, with the primary goal of maintaining highly scalable and reliable services worldwide
  • Handle Tier 3 escalation for production issues

About you.

  • Preferred: BS in Computer Science, Electrical Engineering or Computer Engineering (or equivalent professional experience)
  • Service reliability/operational experience running large-scale, high-performance systems and Internet services
  • Expert-level knowledge of Windows/Linux system administration at scale
  • Knowledge of networking concepts and application protocols, especially TCP/IP, BGP, HTTP/S and DNS
  • Some experience with distributed analytic processing technologies (Hadoop, Hive, Pig, Presto, MapReduce, etc.)
  • You can write reliable and understandable code in Python or other languages
  • You aim to always be learning new things and working in new spaces, and you share this passion with those around you
  • Ability to work in a highly collaborative environment and to communicate effectively with internal and external partners

Job tags: Hadoop High availability High performance Linux Python Windows
Job region(s): North America Remote/Anywhere