Staff Software Engineer, ML Infrastructure

Remote - US

Posted 1 month ago

At ASAPP, we are on a mission to build transformative machine learning-powered products that push the boundaries of artificial intelligence and customer experience. We focus on solving complex, data-rich problems — the kind where there are huge systemic inefficiencies and where a real solution will have a significant economic impact. Our CX performance platform uses machine learning across both voice and digital engagement channels to augment and automate human work, radically increasing productivity and improving the efficiency and effectiveness of customer experience teams.
We are looking for a new member to join our Machine Learning team as a Staff Machine Learning Engineer focused on ML Infrastructure. You should have the passion to tackle tough problems by bringing your expertise to ASAPP to help us solve cutting-edge machine learning and natural language processing (NLP) challenges.
As a part of our team, you will work with our large-scale machine learning systems for training and deploying hundreds or thousands of cutting-edge models at the same time. You’ll work closely with our researchers, site reliability engineers, data engineers, and fellow machine learning engineers to evolve our data pipelines and data management and accelerate the research-to-production process. You will be part of the team responsible for the design and development of our next-generation production experimentation and A/B testing platform, and you will constantly engage with both new engineering challenges and the latest in ML research.
As our operations keep growing, we encourage applicants from all locations in the US to apply. 

What you'll do

  • Help us turn research into meaningful ML products that help tens or hundreds of millions of users
  • Design, build, and maintain model training systems that leverage and enable cutting-edge developments in training machine learning models at scale
  • Actively work on creating and improving tools to parallelize model training, unifying dataset creation and accuracy measurements across experiments
  • Collaborate with and mentor engineers and researchers to help them maximize the speed and efficiency in model research and training
  • Actively follow advancements in AI and ML, and participate in discussions about them with the ML and research teams

What you'll need

  • Minimum of 5 years of experience working on distributed computing projects
  • Production experience with Machine Learning and/or Natural Language Processing
  • Desire to learn new things, work closely with peers from different teams, and help others
  • Production experience with modern cloud computing management (AWS, Kubernetes, Docker, etc.)
  • Experience with at least one major programming language (Python, Go, Java, etc.)

What we'd like to see

  • Proficiency in deep-learning frameworks like TensorFlow or PyTorch and/or experimentation and training frameworks such as PyTorch Lightning
  • Experience working with distributed computing technologies (for example: Hadoop, Spark, Airflow, Ray)
  • Experience in big-data, ETL, or large-scale data science
  • Knowledge of additional languages such as Python, Go, Scala, JavaScript, or TypeScript is not necessary but will help you work in cross-functional teams
  • Be passionate about something we don’t already have expertise in!

Benefits & Perks

  • Competitive compensation with stock options
  • Comprehensive medical, vision, and dental insurance
  • 401k matching
  • Fitness and wellness stipend
  • Mental well-being benefits
  • Professional learning and development stipend
  • Parental leave, including adoptive and foster parents
  • 3 weeks paid time off (increases with tenure) and unlimited sick leave

ASAPP is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, disability, age, or veteran status. If you have a disability and need assistance with our employment application process, please email us at to obtain assistance.

Job tags: Airflow AWS Docker Go Hadoop Java JavaScript Kubernetes Python Scala Spark
Job region(s): North America Remote/Anywhere