DevOps Manager

San Francisco or Remote

Labelbox

Posted 2 weeks ago

Labelbox’s mission is to build the best products for humans to advance artificial intelligence. Real breakthroughs in AI depend on the quality of the training data. Our training data platform enables organizations to improve their machine learning models faster and more accurately. We are determined to build software that is more open, easier to use, and singularly focused on getting our customers to performant ML faster.
Current Labelbox customers are transforming industries including insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises including Allstate, John Deere, Bayer, and Warner Brothers, and by leading AI-focused companies including FLIR Systems and Caption Health. We are backed by leading investors including Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), and Kleiner Perkins.

What you’ll be doing

  • Managing a team of DevOps / Operations / Site Reliability Engineers responsible for supporting the Labelbox cloud and on-premises offerings
  • Organizing, planning, and executing an infrastructure roadmap covering areas such as cloud infrastructure, on-premises infrastructure, and developer productivity
  • Collaborating with other Engineering Managers on cross-cutting team initiatives 
  • Working with database technologies such as PostgreSQL, MySQL, or other RDBMS
  • Working with open source technologies such as Redis, Elasticsearch, and RabbitMQ
  • Helping individuals with career growth and development

What we’re looking for

  • 6+ years of DevOps / Operations / Site Reliability Engineering experience
  • 3+ years of technical management experience
  • Experience with public cloud infrastructure
  • Experience managing data at scale
  • Experience building and managing data pipelines
  • Experience with CI/CD
  • Willingness to roll up your sleeves and get your hands dirty when needed

Bonus

  • Experience with automation tools and technologies such as shell scripting, Terraform, and Helm
  • Experience deploying, maintaining, and automating services in on-premises environments
  • Coding skills in languages such as Java or Golang
  • Experience with SOC 2, FedRAMP, HIPAA, and other compliance-related programs
  • Experience managing multiple Kubernetes clusters / clusters spanning multiple cloud providers
  • Advanced knowledge of infrastructure management in GCP
Job tags: CD, CI, Elasticsearch, GCP, Golang, Java, Kubernetes, MySQL, Open source, PostgreSQL, RabbitMQ, Redis, Reliability engineering, Terraform
Job region(s): Europe, Remote/Anywhere