Senior Site Reliability Engineer, Infrastructure

Bothell, WA; Austin, TX; San Diego, CA; Santiago, Chile; Viña del Mar, Chile

Posted 4 weeks ago


Our SRE team is responsible for the overall performance and reliability of Evernote’s service and products. That service supports over 200 million passionate and engaged users around the world, who store billions of notes and files.

The Infrastructure Engineering team in SRE creates resilient and scalable compute, network, storage, and database systems that serve as the foundation of the Evernote service. We provide our engineering teams with platforms to run the software features that delight our users. As a Senior Site Reliability Engineer, you will contribute to the ongoing mission of delivering an exceptional service to our users.


What you’ll do

  • You will research and analyze new technology to solve problems at all layers of our stack
  • You will partner closely with engineering teams to maintain and scale our platforms
  • You will own the development of technical standards for new services that ensure success in production environments
  • You will publish internal design documentation and procedures that provide detailed specifications for the engineering audience
  • You will develop software and maintain automation systems to reduce toil and to run our infrastructure at scale
  • You will design and implement secure solutions with our Security team to protect our users’ data
  • You will champion our SLOs and continuously improve them
  • You will act as a subject-matter expert for critical infrastructure and provide mentorship for the department in those areas
  • You will participate in an on-call rotation to help maintain the availability of our service so that users always have access to their data


What we’re looking for

  • You take initiative and lead by example to motivate your peers
  • You focus on quality to build resilient, scalable, and maintainable systems
  • You make decisions based on data and exercise judgement to balance risks and rewards
  • You partner with your teammates and thrive in a collaborative environment to tackle challenging technical problems
  • You share enthusiastically with your colleagues and provide strong mentorship


What you’ve done

  • You have 6 or more years of experience running a large-scale, online web service
  • You know Linux systems like the back of your hand and have mastered the fundamental TCP/IP networking protocols (e.g. HTTP, DNS)
  • You have deployed Kubernetes and cloud-native infrastructure and worked with product teams to launch and run microservices in production
  • You have experience with distributed web applications and service mesh platforms
  • You have integrated and used third-party metrics and monitoring platforms such as Datadog and PagerDuty
  • You have successfully deployed configuration management and orchestration tools
  • You have developed extensible and maintainable automation and written software that makes an SRE’s job easier


Skills that are particularly meaningful to us

  • Google Cloud Platform: VPC networking, GCE, GKE, GCS, Pub/Sub, Spanner, App Engine, BigQuery, Bigtable
  • AWS: EC2, S3
  • Monitoring: PagerDuty, Datadog, Splunk
  • Tools: Ansible, Puppet, Helm, Jenkins, Cloud Deployment Manager, Terraform
  • Infrastructure: Kubernetes, HAProxy, Envoy, Elasticsearch, Consul, Istio, Vault
  • Languages/Libraries: Python, Java, Go, Thrift, Node.js, gRPC


** Please submit your CV in English only **  

We are committed to an inclusive and diverse Evernote. We believe that different perspectives lead to better ideas, and better ideas allow us to better understand the needs and interests of our diverse, global Evernote Community. We welcome people of different backgrounds, experiences, abilities and perspectives and are an equal opportunity employer.

California privacy notice: Read our privacy policy for job applicants here.
