Site Reliability Engineering - Senior Database Operations Engineer
As Addepar's SRE Database Administrator, you'll be responsible for ensuring the availability and reliability of Addepar's core production databases. You'll play a leading role in implementing and maintaining best practices for database administration. Addepar moves quickly, so it's vital that you keep our databases running at optimum speed and efficiency. This position involves critical duties and responsibilities that must continue to be performed during crises and contingency operations, which may necessitate extended hours of work. In this position, you will be responsible for the design, configuration, deployment, and operation of Database operations and Database observability systems and tools. These systems include monitoring services and infrastructure, log collection and analytics, and Database Performance Monitoring (DPM). Together, these systems and tools serve as a critical part of Addepar’s Cloud infrastructure services. The ideal candidate will have strong experience with cloud platforms, AWS Database Services, MySQL, PostgreSQL, Linux systems, and DBAOps, and will demonstrate experience with database architecture and production workloads at scale.
- Assuming administrative and operational ownership of Addepar's self-managed and AWS-based Database services, including MySQL and PostgreSQL.
- Highly skilled at database design, installations, and conversions; responsible for database backup and recovery procedures, access security, database integrity, data storage design, and data storage management.
- Experienced in Database Performance, Maintenance (DPM) and Development to perform the following: Database management including backups, recovery, and maintenance; Database Disaster Recovery planning, testing, execution, and documentation; Database Security policies, access control, and security settings; Database management for export, import, and mass data loads; Analysis and generation of all table and index statistics; Database monitoring and timely resolution of bottlenecks and data lock issues; Checking all statistics related to memory, locks, I/O, database response time, wait events, top sessions, and background processes; On-Line Transaction Processing (OLTP) to capture, validate, and store large volumes of transactions; Providing database guidance to engineers and participating in peer review of pull requests related to SQL code.
- Create, operate and deliver KPI-focused database monitoring solutions and reports using enterprise-grade Database observability methodologies and risk management processes.
- On-board Addepar Databases and critical applications to Enterprise Observability tools and services across Addepar to enable monitoring and alerting with best practices and standard Database Performance Monitoring (DPM) tools
- Troubleshooting and resolving database performance-related issues
- Establish customized monitoring dashboards, thresholds, and alerting to fully enable Addepar application and support teams
- Establish, document, and maintain a Database observability program, including monitoring tools, support runbooks, and documentation, to ensure continuous monitoring in production
- Serve as a subject matter expert on Database Performance Monitoring (DPM), logging, and other observability & visualization tools
- Provide expert performance consultation and training services to our application development and platform support partners
- Supporting the high availability and disaster recovery options of Addepar's Database instances - AlwaysOn Availability and Database Replication are in place and active
- Developing automated scripts for maintenance tasks
- Build, own, update, and maintain all Database topology and architecture diagrams and documentation.
- Maintain, develop and report on metrics relative to Critical Incident Response Team activities for monthly business and flash reporting for senior leadership and internal business units
- Work with a team of experienced engineers to test your ideas and understand the system, and mentor junior team members
- Build and maintain successful relationships with existing and prospective members, ensuring end-to-end observability of Addepar's platform
- Excellent problem solving and critical thinking skills, and ability to function and communicate under pressure
- Participate in and lead efforts involving incident response and root cause analysis for Site Reliability Engineering
- Being on-call and available for rotational shift support for production issues
- Demonstrate outstanding communication, flexibility, teamwork, and leadership
- Participate in, present, and speak to KPI metrics and uptime performance data in management and executive-level debriefs.
Knowledge & Skills
- 10+ years of experience in Database engineering or Database Administration.
- Developing pipelines using CI/CD tools; GitHub Actions and/or Jenkins is a plus.
- Linux/*NIX administration.
- Public cloud providers: AWS, Azure.
- Programming experience in Java, C++, Python, or Go, or deep experience managing large-scale software and distributed systems and environments.
- An understanding of and experience with web application development.
- A solid foundation in computer science, with competencies in data structures, algorithms, and software design practices.
- Understand database design, caching, scalability, and network fundamentals.
- 5+ years of experience with Docker, Kubernetes, Sensu, Prometheus, or other CNCF software is a big plus.
- An understanding of and experience with incident alerting platforms such as PagerDuty.
- Technical documentation skills, including the ability to document every task step by step.
- BS or MS degree in Computer Science or a related technical field, or equivalent industry experience.
- An understanding of and experience in Product/Project management and issue tracking systems (Jira, Smartsheets, Aha a plus)
- Demonstrated ability in leadership, facilitation, collaboration, and negotiation.
- Demonstrated ability to operate in a highly matrixed environment and align to a common vision, objectives, and outcomes.
- Ability to translate complicated technical concerns to non-technical individuals.
- AWS Certified Database - Specialty
- AWS Certified Solutions Architect - Professional is a plus
Addepar is a wealth management platform that specializes in data aggregation, analytics, and reporting for even the most complex investment portfolios. The company’s platform aggregates portfolio, market, and client data all in one place. It provides asset owners and advisors a clearer financial picture at every level, allowing them to make more informed and timely investment decisions. Addepar works with hundreds of leading financial advisors, family offices, and large financial institutions that manage data for over $2 trillion of assets on the company’s platform. In 2020, Addepar was named to the Forbes Fintech 50, and in 2018 it received Morgan Stanley’s Fintech Award for making a significant impact on the firm’s mission of continuous innovation. Addepar is headquartered in Silicon Valley and has offices in New York City and Salt Lake City.
Addepar is an equal opportunity employer. We’re committed to building together and to do that best, we rely on a range of backgrounds, experiences, and ideas.
In order to ensure the health and safety of all Addepeeps and our prospective candidates, we have instituted a virtual interview and onboarding experience.