Staff Software Engineer, Data Infrastructure
Buenos Aires
Full Time

ASAPP
At ASAPP, we’re working to solve complex and challenging problems by building beautiful and useful machine learning-powered products. We leverage artificial intelligence to address significant challenges that share three common characteristics: gigantic economic size, systemic inefficiency, and large amounts of data. The talented teams that drive the innovation and development of our products are based in New York City, San Francisco, and Buenos Aires.
The Data Engineering team at ASAPP designs, builds, and maintains our mission-critical core data infrastructure and analytics platform. Accurate, easy-to-access, and secure data is critical to our natural language processing (NLP) customer interaction platform, which interacts with tens of millions of end users in real time. We’re looking to hire a data engineer with a knack for building data infrastructure systems that can handle our ever-growing volumes of data and the demands we want to make of it. Automation is a key part of our workflow, so you’ll help design and build highly available data processing pipelines that self-monitor and report anomalies. You’ll need to be an expert in ETL processes and know the ins and outs of the various data stores that serve data rapidly and securely to all internal and external stakeholders. As part of our fast-growing data engineering team, you’ll also play an integral part in shaping the future of our data infrastructure as it applies to improving our existing metric-driven development and machine learning capabilities.
What you'll do
- Design and deploy improvements to our mission-critical production data pipeline and data warehouse
- Recognize patterns and generalizations in data flows and automate as much as possible to drive productivity gains
- Expand our logging and monitoring processes to discover and resolve anomalies and issues before they become problems
- Develop state-of-the-art automation and data solutions in Python and Spark (see the pipeline sketch after this list)
- Increase the efficiency, accuracy, and repeatability of our ETL processes
- Know how to make the tradeoffs required to ship without compromising quality
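As a rough illustration of the pipeline work described above, here is a minimal PySpark sketch of a batch ETL step with a simple self-monitoring check before the load. The S3 paths, column names, and job name are hypothetical placeholders, not a description of ASAPP’s actual pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical batch ETL step: roll raw interaction events up into a
# daily-partitioned table, with a basic sanity check before the load.
spark = SparkSession.builder.appName("daily_event_rollup").getOrCreate()

events = spark.read.json("s3://example-bucket/raw/events/")  # placeholder path

daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "customer_id")
    .agg(F.count("*").alias("event_count"))
)

# Self-monitoring hook: fail loudly instead of silently loading an empty
# partition, so the orchestrator can alert on the failure.
if daily.count() == 0:
    raise RuntimeError("daily_event_rollup produced 0 rows; aborting load")

(
    daily.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/warehouse/daily_event_rollup/")  # placeholder path
)
```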
What you'll need
- 5+ years of experience in a data engineering and/or quantitative data-driven role
- Expertise in at least one flavor of SQL, e.g. Redshift, Postgres, MySQL, Presto, Spark SQL, or Hive
- Proficiency in a high-level programming language. We use Python, Scala, and Go
- Experience with CI/CD (continuous integration and deployment)
- Experience with workflow management systems such as Airflow, Oozie, Luigi, or Azkaban (a minimal Airflow sketch follows this list)
- Experience implementing data governance, e.g. access management policies and data retention
- Confidence operating in a DevOps-like capacity with AWS, Kubernetes, Jenkins, Terraform, and other declarative infrastructure tooling, with an eye toward automation, alerting, monitoring, and security
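For context on the workflow management experience mentioned above, a minimal Airflow DAG along these lines might look like the following sketch; the DAG name, tasks, schedule, and alerting settings are illustrative assumptions rather than ASAPP’s actual configuration.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw events from an upstream source.
    pass


def load(**context):
    # Placeholder: load transformed data into the warehouse.
    pass


default_args = {
    "owner": "data-eng",          # illustrative owner
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,     # hook alerting into the workflow
}

with DAG(
    dag_id="example_etl",         # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
    default_args=default_args,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task
```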
What we'd like to see
- A Bachelor’s Degree in a field of science, technology, engineering, or math, or equivalent hands-on experience
- Experience in applying data engineering to machine learning model training
- Familiarity operating Kubernetes clusters for a variety of jobs and apps, including high-throughput workloads
- Technical knowledge of data exchange and serialization formats such as Protobuf, Avro, or Thrift (see the schema sketch after this list)
- Experience in either deploying or developing with Spark
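As a small illustration of the serialization formats listed above, the following sketch defines an Avro record schema and round-trips a record with the fastavro library; the schema and field names are made up for the example.

```python
import io

from fastavro import parse_schema, reader, writer

# Hypothetical Avro schema for an interaction event; field names are illustrative.
schema = parse_schema({
    "name": "InteractionEvent",
    "type": "record",
    "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "customer_id", "type": "long"},
        {"name": "event_ts", "type": "long"},  # epoch milliseconds
    ],
})

records = [{"event_id": "evt-001", "customer_id": 42, "event_ts": 1_700_000_000_000}]

# Serialize to an in-memory buffer, then read the records back.
buf = io.BytesIO()
writer(buf, schema, records)

buf.seek(0)
for record in reader(buf):
    print(record)
```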
Perks
- Competitive compensation
- Stock options
- OSDE 410 for the family group
- Wellness perks
- Mac equipment
- 3 weeks of vacation
- Training and development
- English lessons
Job tags:
Airflow
AWS
CD
CI
Go
Kubernetes
MySQL
Postgres
Python
Redshift
Scala
Spark
SQL
Terraform
Job region(s):
South America