Senior Infrastructure Engineer
About the Company:
Clarifai is a leading, full-lifecycle deep learning AI platform for computer vision and natural language processing. We help organizations transform unstructured image, video, and text data into structured data significantly faster and more accurately than humans could on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow, with employees based remotely throughout the United States and in Tallinn, Estonia.
We have secured $40M in funding to date, backed by Menlo Ventures, Google Ventures, USV, NVIDIA, Qualcomm, Osage, Lux Capital, LDV Capital, and Corazon Capital.
Clarifai is proud to be an equal opportunity workplace dedicated to pursuing, hiring, and retaining a diverse workforce.
We build the systems and services that other engineers use to build the Clarifai products. Neural networks are data-hungry beasts and we work to keep them well-fed. Clarifai’s infrastructure team is responsible for the overall availability and reliability of the products we sell. We work with the other engineering teams to ensure they have the tools and resources they need to deliver the best product for our customers.
Your focus on engineering efficiency will empower the engineers around you to deliver products faster than ever before, while your knowledge of service reliability best practices will help Clarifai scale to meet growing customer demand.
What You’ll Do:
- You’re a doer with the ability to drive projects forward.
- Monitor and maintain the stability and scalability of our products and work with teams to identify and resolve potential bottlenecks.
- Develop and maintain the internal services that power our infrastructure.
- Provide the tools and platforms used by our engineers to build and maintain our products.
What You Bring:
- Proficiency in Python, Golang, C++, or Java.
- You are well-versed in observability and reliability best practices.
- Experience deploying container orchestration (e.g. Kubernetes, GKE, EKS).
- Experience debugging and operating common cloud datastores (e.g. RDS, Cloud SQL, Redshift) or their open source alternatives.
- You are an expert with CI/CD pipelines.
- You have experience with distributed computing and storage (e.g. Hadoop, Spark, HDFS, Ceph).
- You’re comfortable with the entrepreneurial pace of a successful start-up.
- You’re comfortable being, at times, the most senior person in the room when providing infrastructure support to other teams.
Would Be Nice If You Also Have:
- Experience using GPUs for computation.
- Experience with hybrid (cloud and on-premise) infrastructure deployments.
- Experience working on longer-term initiatives with pre-defined achievements and deliverables (for example, government projects).
In your first month, you will start off by learning the ropes. You will:
- Develop a mental model of the existing services that make up Clarifai’s production stack.
- Begin to tackle tasks which expose you to different pieces of our infrastructure.
- Work with peers to find which tasks are the highest priority and then deliver on them.
3 months later, you will use your understanding of Clarifai’s infrastructure to find the critical areas to address. You will:
- Use your expertise to actively identify areas the team should be addressing within our existing infrastructure.
- Work with other engineering teams to deliver infrastructure improvements for current product initiatives.
6 months down the road, you'll have an understanding of current and future projects that will allow you to scale our products and our engineering practices. You will:
- Address critical infrastructure issues before they become future postmortems.
- Work with other engineering teams to deliver infrastructure improvements for future product initiatives, empowering those teams to deliver on their schedule.
- Scale our standard engineering methodologies, enhancing the efficiency of your peers through better processes, tools, or both.
In 12 months, your deep understanding of our product and infrastructure will be critical in determining the future vision for our infrastructure. You will:
- Continue to scale our existing tools and platforms for a rapidly growing engineering organization and product suite.
- Identify large future initiatives the infrastructure layer should be focusing on, ensuring we remain ahead of any scaling bottlenecks and product requirements.
In the future, you’ll continue to ensure that our products scale to meet growing customer demand and that our engineering team has the tools and platforms it needs to deliver new products faster than before.