Systems Engineer / DevOps

San Francisco

Hive

Posted 1 month ago

About Hive
Hive is a full-stack deep learning platform helping to bring companies into the AI era. We take complex visual challenges and build custom machine learning models to solve them. For AI to work, companies need large volumes of high-quality training data. We generate this data through Hive Data, our proprietary data labeling platform with over 1,000,000 globally distributed workers, generating millions of high-quality pieces of data per day. We then use this training data to build machine learning models for verticals such as Media, Autonomous Driving, Security, and Retail. Today, we work with some of the largest companies in the world to redefine how they think about unstructured visual data. Together, we build solutions that incorporate AI into their businesses to completely transform industries.
We are fortunate that investors like Peter Thiel (Founders Fund), General Catalyst, 8VC, and others see Hive's potential to be groundbreaking in AI business solutions. We have over 160 talented individuals globally in our San Francisco and Delhi offices. Please reach out if you are interested in joining the AI revolution!
DevOps and Systems Team
Our unique machine learning needs led us to open our own data centers, with an emphasis on GPU resources. Even with these data centers, we maintain a hybrid infrastructure that extends into AWS to power some parts of our consumer apps. As we continue to commercialize our machine learning models, we also need to grow our DevOps and Systems team to maintain the reliability of a SaaS offering for our customers. Our ideal candidate is someone who thrives in an unstructured environment and takes automation seriously. You believe there is no task that can’t be automated and no server scale too large. You take pride in ensuring developers can deploy their servers without worrying about downtime.

Responsibilities

  • Automate manual operational processes
  • Improve workflows of developer, data, and machine learning teams
  • Manage integration and deployment tooling
  • Create, maintain, monitor, and audit infrastructure
  • Manage a diverse array of technology platforms, following best practices and procedures
  • Participate in on-call rotation and root cause analysis

Requirements

  • Minimum of 1-2 years of previous experience in development, operations, IT, or a related field
  • Comfortable working on Linux infrastructures (Debian) via the CLI
  • Able to learn quickly in a fast-paced environment
  • Able to multitask, prioritize, and manage time efficiently and independently
  • Able to physically lift equipment weighing at least 30 pounds
  • Can communicate effectively across teams and management levels
  • Degree in computer science, or similar, is an added plus!

Technology Stack

  • Operating Systems - Linux/Debian Family/Ubuntu
  • Configuration Management - Chef/Ansible/Puppet/Salt
  • Containerization - Docker
  • Container Orchestrators - Mesosphere/Kubernetes
  • Scripting Languages - Python/Ruby/Node/Bash
  • CI/CD Tools - Jenkins
  • Network hardware - Arista/Cisco/Fortinet
  • Hardware - HP/SuperMicro
  • Storage - Ceph, S3
  • Database - Scylla, Postgres, Pivotal GreenPlum
  • Message Brokers: RabbitMQ
  • Logging/Search - ELK Stack
  • AWS: VPC/EC2/IAM/S3
  • Networking: TCP/IP, ICMP, SSH, DNS, HTTP, SSL/TLS
  • Storage systems: RAID, distributed file systems, NFS/iSCSI/CIFS
What We Offer You
We are a group of ambitious individuals who are passionate about creating a revolutionary machine learning company. At Hive, you will have a significant career development opportunity and a chance to contribute to one of the fastest-growing AI startups in San Francisco. The work you do here will have a noticeable and direct impact on the development of Hive.
Our benefits include competitive pay, equity, health / vision / dental insurance, catered lunch and dinner, and a corporate gym membership.
Thank you for your interest in Hive.
Job tags: Ansible AWS Bash CD Chef CI Debian Docker EC2 ELK Kubernetes Linux Node Postgres Puppet Python RabbitMQ Ruby S3 Salt Ubuntu
Job region(s): North America