Product Reliability Engineer

Denver, CO

Palantir Technologies

Posted 6 days ago

A World-Changing Company
At Palantir, we’re passionate about building software that solves problems. We partner with the most important institutions in the world to transform how they use data and technology. Our software has been used to stop terrorist attacks, discover new medicines, gain an edge in global financial markets, and more. If these types of projects excite you, we'd love for you to join us.
The Role
Product Reliability Engineers (PREs) are the driving force behind stability across Palantir’s products. PREs help ensure our products are available 24/7. When something goes wrong, PREs are the first to respond and are responsible for triaging, troubleshooting, and coordinating the resolution of the issue.
Every day at Palantir is different: we’re constantly evolving to better respond to customer needs, and as a PRE you will embed with our engineering and business teams to minimize risks associated with the deployment of our products. You are a resourceful, creative, and agile problem solver who is able to work both collaboratively and independently to resolve the most difficult and nebulous technical issues. The work includes creating product health metrics and automated alerts, fixing product bugs, and developing and documenting strategies for responding to incidents.
Whatever the technical issue or question about Palantir’s products, you’ll play a central and critical role in resolving it, seeking not just a one-time fix but a permanent solution.

Example Projects

  • Product Observability: PREs embed with Palantir’s development teams to build and refine metrics, logging, monitoring, and alerting for our products, enabling proactive (and increasingly automated) issue identification, prevention, and remediation. This improves the performance and uptime of our products and provides detailed telemetry that aids in debugging complex issues (e.g., building monitoring to understand when a new feature is not working as expected or when a new release has introduced a product regression).
  • Empowering users to troubleshoot their own issues: Because PREs own their product’s stability work, they are uniquely qualified to identify areas in the product where users cannot easily understand why something went wrong. PREs make this easier by building self-serve tools and processes that help product users understand what the blockers are (e.g., whether a build failure is access-related or caused by a product bug).
  • Product Reliability Systems and Process: PREs are among the most seasoned users of our deployment and stability infrastructure, which means they often identify opportunities for additional functionality that would improve operational efficiency. An important part of the PRE role is partnering with other infrastructure and operational teams across Palantir to provide this input and, where appropriate, directly deliver features that will benefit product operations (e.g., developing a system that automatically de-duplicates product alerts and enables teams to prioritize and document critical information).

Core Responsibilities

  • Develop a deep understanding of Palantir's products and processes.
  • Collaborate with customer-facing, product, and infrastructure teams on the development and deployment of scalable, reliable software for our customers.
  • Diagnose, resolve, and prevent issues encountered in the field.
  • Reduce the operational overhead of Palantir’s products and leverage data to understand the largest sources of reliability risk.
  • Deliver end-to-end improvements to stability by proactively preventing issues via telemetry and automation and directly reducing the need for reactive support.
  • Make data-driven decisions about investments in stability and reliability.
  • Take part in a 24/7 on-call rotation responsible for coordinating Palantir’s response to mission-critical incidents, ensuring efficient resolution with minimal customer impact.

What We Value

  • Background in Computer Science, Engineering, Information Systems, or another technical field.
  • Excellent problem-solving skills, ability to break down and explain complex concepts, and strong attention to detail.
  • Comfort working in a fast-moving environment with dynamic objectives that require creative thinking to address product and customer needs.
  • Ability to work independently and make decisions under minimal supervision, as well as to collaborate as part of a team.
  • Experience coding with Java, Go, and/or web technologies (e.g., HTML, CSS, JavaScript, Python/Ruby, Django/Flask/Ruby on Rails, etc.) is a plus.
  • Experience with distributed computing systems and/or cloud infrastructures (e.g., Spark, Hadoop, YARN, Kubernetes, AWS, etc.) is a plus.
  • Willingness and interest to travel to other Palantir locations as needed.

Benefits

  • Medical, dental, and vision insurance.
  • Life and disability coverage.
  • Paid leave for new parents and emergency back-up care for all parents.
  • Family planning support, including fertility, adoption, and surrogacy assistance.
  • Stipend to help with expenses that come with a new child.
  • Commuter benefits.
  • Relocation assistance.
  • Unlimited paid time off.
  • Two weeks of paid time off built into the end of each year.
Our benefits aim to promote health and wellbeing across all areas of Palantirians’ lives. We work to continuously improve our offerings and listen to our community as we design and update them. The list above details our available benefits and some of the perks that can be enjoyed as an employee of Palantir Technologies.
Salary
The starting salary for this position is estimated to be $82,000/year. Total compensation includes approximately $8,000/year of equity compensation in the form of Restricted Stock Units, based on the recent value of the company’s stock at the time of grant. Note that total compensation for this position will be determined by the candidate’s relevant qualifications, work experience, skills, and other factors. This estimate excludes the value of any potential sign-on bonus, the value of any benefits offered, and the potential future value of any long-term incentives.
Palantir is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. Palantir is committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. Please see the United States Department of Labor's EEO poster and EEO poster supplement for additional information.
Job tags: AWS CSS Django Go Hadoop HTML Java JavaScript Kubernetes Python Rails Ruby Spark
Job region(s): North America