Software Engineer - Data Infrastructure

San Francisco

Plaid Inc.

Posted 2 weeks ago

We believe the way people interact with their finances will drastically improve in the next few years. We’re dedicated to empowering this transformation by building the tools and infrastructure developers need to create their own products. Today, thousands of companies such as Acorns, Stripe, and Venmo rely on Plaid to connect with the financial system, supporting millions of requests per day and thousands of bank integrations. Our goal is to give people more control over their finances and unlock financial freedom for everyone.
Making data-driven decisions is key to Plaid's culture. To support that, we need to scale our data systems while maintaining correct and complete data. We provide tooling and guidance to teams across engineering, product, and business and help them explore our data quickly and safely to get the data insights they need, which ultimately helps Plaid serve our customers more effectively. In addition, Plaid will not be successful if we can't move quickly. We build the data and machine learning infrastructure to enable Plaid engineers to prototype and iterate on products and features built on top of consumer-permissioned financial data.
Engineers on the Data Infrastructure team work across its three workstreams (analytics infra, production data infra, ML platform) to scale our existing data pipelines and build the various pieces of the ML platform from the ground up.
We work in Python, Golang, and TypeScript. Our systems are built on top of Docker, Kubernetes, SageMaker, Spark, S3, Redshift, Airflow, and Elasticsearch.
Our engineering culture is IC-driven -- we favor bottom-up ideation and empowerment of our incredibly talented team. We are looking for engineers who are motivated by creating impact for our consumers and customers, growing together as a team, shipping the MVP, and leaving things better than we found them.

What excites you

  • Defining the long-term technical roadmap for machine learning and data-driven iteration at Plaid
  • Leading key data infrastructure projects across multiple workstreams, such as building an internal ML-as-a-service platform and an incremental stream processing pipeline
  • Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid 
  • Mentoring other engineers into senior roles and establishing a culture of technical excellence

What excites us

  • 2+ years of production experience, core fundamentals, and a strong command of at least one language
  • Production experience building out data systems that make it a breeze to ingest, process, and analyze terabytes of data
  • Empathy and enthusiasm for understanding other teams’ challenges and the ability to influence them towards the right technical path for the org
  • Knowledge of Spark, Hadoop, Airflow, or other data infrastructure tools



Plaid is proud to be an equal opportunity employer and values diversity at our company. We do not discriminate based on race, color, national origin, ethnicity, religion or religious belief, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, military or veteran status, disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state, and local laws. Plaid is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance with your application or interviews due to a disability, please let us know at accommodations@plaid.com.
Job tags: Airflow Docker Elasticsearch Golang Hadoop Kubernetes Python Redshift S3 Spark
Job region(s): North America