Senior Infrastructure Engineer

NYC (remote)

Full Time Senior-level / Expert
Clarifai Inc.

About the Company:

Clarifai is a leading, full-lifecycle deep learning AI platform for computer vision and natural language processing. We help organizations transform unstructured image, video, and text data into structured data significantly faster and more accurately than humans could on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow, with employees based remotely throughout the United States and in Tallinn, Estonia.

We have secured $40M in funding to date, backed by Menlo Ventures, Google Ventures, USV, NVIDIA, Qualcomm, Osage, Lux Capital, LDV Capital, and Corazon Capital.

Clarifai is proud to be an equal opportunity workplace dedicated to pursuing, hiring, and retaining a diverse workforce.

Your Opportunity:

We build the systems and services that other engineers use to build the Clarifai products. Neural networks are data-hungry beasts and we work to keep them well-fed. Clarifai’s infrastructure team is responsible for the overall availability and reliability of the products we sell. We work with the other engineering teams to ensure they have the tools and resources they need to deliver the best product for our customers. 

Your Impact:

Your focus on engineering efficiency will empower the engineers around you to deliver products faster than ever before, while your knowledge of service reliability best practices will help Clarifai scale to meet growing customer demand.

What You’ll Do:

  • Drive projects forward as a hands-on doer.
  • Monitor and maintain the stability and scalability of our products and work with teams to identify and resolve potential bottlenecks.
  • Develop and maintain the internal services that power our infrastructure.
  • Provide the tools and platforms used by our engineers to build and maintain our products.

What You Bring:

  • Proficiency in Python, Golang, C++, or Java.
  • You are well-versed in observability and reliability best practices.
  • Experience deploying container orchestration (e.g. Kubernetes, GKE, EKS).
  • Experience debugging and operating common cloud datastores (e.g. RDS, Cloud SQL, Redshift) or their open source alternatives.
  • You are an expert with CI/CD pipelines.
  • You have experience with distributed computing and storage (e.g. Hadoop, Spark, HDFS, Ceph).
  • You’re comfortable with the entrepreneurial pace of a successful start-up.
  • You’re comfortable being, at times, the most senior person in the room when providing infrastructure support to other teams.

Would Be Nice If You Also Have:

  • Experience using GPUs for computation.
  • Experience with hybrid (cloud and on-premise) infrastructure deployments.
  • Experience working on longer-term initiatives with pre-defined achievements and deliverables (for example, government projects).


In your first month, you will start off by learning the ropes. You will:

  • Develop a mental model of the existing services that make up Clarifai’s production stack.
  • Begin to tackle tasks which expose you to different pieces of our infrastructure.
  • Work with peers to find which tasks are the highest priority and then deliver on them.

3 months later, you will use your understanding of Clarifai’s infrastructure to find the critical areas to address. You will:

  • Use your expertise to actively identify areas the team should be addressing within our existing infrastructure.
  • Work with other engineering teams to deliver infrastructure improvements for current product initiatives.

6 months down the road, you'll have an understanding of current and future projects that will allow you to scale our products and our engineering practices. You will:

  • Address critical infrastructure issues before they become future postmortems.
  • Work with other engineering teams to deliver infrastructure improvements for future product initiatives, empowering those teams to deliver on their schedule.
  • Scale our standard engineering methodologies, enhancing the efficiency of your peers through better processes, tools, or both.

In 12 months, your deep understanding of our product and infrastructure will be critical in determining the future vision for our infrastructure. You will:

  • Continue to scale our existing tools and platforms for a rapidly growing engineering organization and product suite.
  • Identify large future initiatives the infrastructure layer should be focusing on, ensuring we remain ahead of any scaling bottlenecks and product requirements.

In the future, you’ll continue to ensure that our products scale to meet growing customer demand and that our engineering team has the tools and platforms it needs to deliver new products faster than before.


Job region(s): Remote/Anywhere North America