Site Reliability Engineer III (Remote Eligible)
ID.me simplifies how individuals share and prove their identity online. ID.me's next generation platform facilitates identity proofing, authentication, and group affiliation verification for over 500 organizations.
ID.me is simplifying how individuals securely prove and share their identity online. With their secure digital identity network, ID.me is doing for identity what Visa did for financial transactions. ID.me empowers people to fully control their own data through a portable and trusted login so they don’t need to create a new password at each site they visit.
The COVID-19 pandemic has accelerated a massive digital migration for many critical services. These services require a trusted identity to ensure an individual is who they claim to be while keeping out fraud. Identity verification that serves only one organization is costly and time-intensive. Separate passwords for each application add to consumer frustration. With ID.me, login and identity credentials move with an individual so they only need to verify once.
ID.me is a federally certified identity provider that meets the highest standards NIST has set for consumer identity verification and login. ID.me is one of only four companies in the United States of America certified by the federal government to bind a legal identity to a digital login.
In addition to providing individuals with complete control over their credentials and data, the company has a “No Identity Left Behind” initiative to expand access and inclusion for all individuals through a video chat verification process. ID.me is passionate about building a robust identity network that does not compromise access for hard-to-identify groups.
We are looking for a Site Reliability Engineer III who will combine software and systems engineering to build and run distributed, fault-tolerant systems at scale. SREs ensure our services have the appropriate reliability and uptime to protect our customers’ experience.
- Conceive, design, and build infrastructure tooling that improves the availability, scalability, latency, and efficiency of ID.me services across the entire product surface area
- Manage end-to-end availability and performance of key services and build automation to prevent problem recurrence
- Build visibility into SLIs, SLOs, SLAs, and dependency graphs to reduce operational burden
- Implement observability and instrumentation patterns that alert on symptoms to help reduce and prevent outages
- Proactively identify risks and develop engineering process, tooling, or work streams that reduce that risk
- Evangelize and mentor service owners on reliability, resiliency, and scalability for new services and features
- Collaborate with service owners to improve production landscape for existing services
- Facilitate and participate in an on-call rotation and hold retrospective root cause analysis meetings, using blameless postmortems to identify remediations
- At least 3 years of experience working in medium or large scale production systems
- The ability to take a systematic approach to analyzing, troubleshooting, and diagnosing system problems so that you can locate and resolve them
- You can code to automate management of servers and software. When a problem needs a software solution, you roll up your sleeves and get to work
- You design for scale. You manage cattle to avoid snowflakes of systems and applications. You design systems to auto-scale and auto-heal
- You have a breadth of engineering skills with an interest in service reliability, automation, monitoring, and capacity planning
- Understanding of modern architecture, e.g. micro-services, EDA, etc., and you are wary of overcomplexity and over-engineering
- You enjoy working with the latest monitoring and metrics platforms, e.g. New Relic, Prometheus, InfluxDB, Grafana, Splunk, etc.
- Deep knowledge of AWS technologies, e.g. CLI, Aurora, S3, IAM, EC2, ECS, ECR, KMS, CloudWatch, Lambda, Route53, SQS, SNS, CodeDeploy
- Previous experience working within an SRE culture, improving reliability with automation, chaos testing, and process improvement
- Experience designing and operating distributed systems and cloud infrastructure at scale
- Strong written communication since we are a remote company
- Experience in supporting a 24/7 infrastructure including on-call rotations
Ideal candidate will thrive in the following culture:
- Must have an obsession for building quality products
- Ability to thrive when priorities change and the team shifts gears
- Strong oral and written communication skills
- Must be a team player with a strong, self-managing work ethic
- Must be a self-starter with a passion for security engineering, learning and continuous improvement
Note that candidates must be located in the continental U.S.
ID.me COVID-19 Vaccination Requirement
ID.me has a mandatory vaccination requirement where not prohibited by applicable federal or state law.
All current and future employees are required to receive their COVID-19 vaccinations, unless a reasonable accommodation is approved. Employees not in compliance with this policy will be placed on leave and will be terminated if no valid reason for not getting the COVID-19 vaccine is provided.
Purpose: In accordance with ID.me's duty to provide and maintain a workplace that is free of known hazards, we are adopting this policy to safeguard the health of our employees and their families; our customers and visitors; and the community at large from the risk of COVID-19, which may be reduced by vaccination. This policy will comply with all applicable laws and is based on guidance from the Centers for Disease Control and Prevention and local health authorities, as applicable.
Reasonable Accommodation: Current and future employees in need of an exemption from this policy due to a medical reason or a sincerely held religious belief must submit a completed Request for Accommodation form to the human resources department to begin the interactive accommodation process. The form should be submitted as soon as possible after vaccination deadlines have been announced (September 13th) and an offer of employment has been made. Accommodations will be granted where they do not cause ID.me undue hardship or pose a direct threat to the health and safety of others.
Vision: To be the world's leading digital identity network empowering people to control their own information and to prove their credentials across all channels: online, call center, and in-person.
Mission: To make the world a more trusted place by delivering the highest level of security with the least amount of friction at the lowest possible cost.
People: We have an audacious mission. We aim to fix the identity layer of the internet. Billions of people will live better lives with more trust and convenience thanks to ID.me. We are like Special Forces. We take on the most difficult challenges with amazing teammates.
ID.me Core Values:
- Don't be a jerk.
- Always compete.
- Ask questions like a 5-year-old.
- Inspire people with your passion.
- Make something better every day.
- Treat each customer like your favorite family member.
- Own your mistakes so you can learn from them.
- Details are everything.
- Communicate like a scientist.
- Be truthful (even when it's hard).
- Reflect ID.me's values in your actions.
- Act like an owner.