Cloud Platform Engineer, Infrastructure
Braintree lets you move money from one place to another safely and securely. Every time you pay for an Uber ride, book a stay through Airbnb, or pay with PayPal when you check out online, you’re probably using our product. It sounds complex (and it is), but we make it so simple you can’t tell we’re there.
We solve world-scale problems and provide opportunities to match. We build diverse teams that recognize our strengths and allow us to work on our weaknesses. You bring skills and a relentless focus on the customer, and we'll provide the support you need to do the best work of your life.
Our Production team is growing! The Production Engineers embrace Unix systems -- from the production clusters down to the desktop. Single-purpose, well-designed tools are always preferred over the combination wifi-coffee-maker-ping-pong-table. We enjoy the stability and flexibility Unix-based systems provide. You will find us swimming in technical solutions that keep the business engaged and moving forward.
Our team does work that ranges from debugging dynamic routing protocols and writing Nagios checks for NTP servers to tweaking the office 802.11b channels to avoid a noisy microwave, all in the same day. Our responsibility is to ensure the technology supports and accelerates our business. Stability, scalability, fault tolerance, and automation are key properties of our solutions. We strive to provide our clients with 100% system availability, as we don't make money unless our clients do.
As a Cloud Platform Engineer, you will be responsible for building the next generation platform for our payments in the cloud. You will be constantly collaborating with our customers, tech leadership and your peers, learning and teaching new ways of making our platform even more resilient, robust and scalable.
What You’ll Do
- Develop solutions and tools to make the lives of Braintree Product engineers better and easier. You will develop solutions from ideation and design, through development, launch, operation and iteration.
- Partner with our customers (product engineering teams) on their products' design, development and capacity planning to ensure Braintree continues to scale and maximize availability.
- Brainstorm and implement ways of reducing tech debt, automating repeated manual tasks and improving team productivity.
- Ideate new ways of doing things, publish RFCs, get buy-in from other engineering leads and implement changes.
- Be an active member of the open source community by reporting new defects and issues, contributing to open source projects and providing help to the community at large.
- Ensure sufficient logging, monitoring and alerting strategies around availability, latency and overall system health.
- Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Be part of incident reviews and blameless post mortems.
- Mentor other members of the organization through their career journey at Braintree.
What We’re Looking For
- Software development background with the ability to analyze and improve an existing codebase.
- Experience with building solutions in the cloud (ideally AWS).
- Established ability to diagnose technical problems, debug code, and automate routine tasks.
- Ability to support a 24/7/365, always-available, production-grade service.
- Experience in one or more of the following: Java, Ruby, Golang, or shell scripting.
- Experience with Unix/Linux operating systems internals and administration.
- Patience and fortitude to debug complex issues in production systems, which is sometimes akin to finding a needle in a haystack.
- Great analytical and problem-solving skills.
- Familiarity with orchestration tools (Ansible, Puppet, Chef, Terraform, etc.).
- Established experience with monitoring/logging tools and best practices.
- Experience in software release lifecycle with modern distributed version control (e.g. git).
- Proficiency in managing large-scale, cloud-based infrastructure.
- Expertise in designing and troubleshooting large-scale distributed systems.
- Strong communicator, both written and spoken.