Staff Site Reliability Engineer
Vonage Engineering Mission: We embody the notion of “be what’s next, now!” We envision, develop and manage technology to connect the world. Our team brings excellence, passion, creativity and curiosity to the job. We look at the business environment and technologies in new and challenging ways, striving to develop and deliver integrated whole-system solutions to meet our customers’ ever-changing needs.
Why this role matters
Site Reliability Engineering provides the framework and systems that enable product development and operations teams to run services reliably at scale. We support the full lifecycle of site, system and software engineering, working with technical and business users to understand requirements and provide the required infrastructure for monitoring, scalability and resiliency testing. You can join our team as a Staff Site Reliability Engineer, helping us apply software engineering to bridge development and operations and make our development and delivery pipelines ever more efficient.
IF THIS SOUNDS LIKE YOU, CONTINUE READING BELOW…
What you will do
- Scope, size and ensure scalability of large product initiatives.
- Support engineering teams to ensure incorporation of monitoring, scalability and resiliency best practices into their product development lifecycle.
- Evaluate site system requirements for compliance, conformity, applicability, performance and cost effectiveness.
- Assess utilization trends and make appropriate design and implementation recommendations to support scaling of infrastructure.
- Manage projects to build scalable, process-focused software and systems.
- Ensure software systems effectively measure, monitor and alert on the state of system and service stacks.
- Provide comprehensive administration and support for production systems and business end-users.
- Troubleshoot, perform root cause analysis and resolve production issues from the network and application layers down to the system level (may include digging into source code, hunting memory leaks, tracing bottlenecks or database query optimization).
- Investigate, evaluate and recommend new systems, services and solutions to enhance customer service and align with business drivers.
- Document and report on performance metrics and instances.
- Provide technical direction on improvement of system support services, tools and technologies.
- Provide team with procedural, process and technical insight and advice.
What you will bring
- Strong understanding of software engineering principles, practices and methodologies.
- Extensive experience with developing and/or operating cloud-based systems and applications at scale.
- Extensive experience with server automation tools like BladeLogic.
- Experience with Red Hat Enterprise Linux installation, configuration and tuning.
- Experience with virtualization technologies like VMware.
- Strong knowledge of operational tools, software, network, databases and applications.
- A strong desire to learn, build and grow.
- Ability to:
  - Measure systems performance and identify issues, including coding, performance, system, distributed-systems and/or design issues.
  - Identify opportunities for process and procedure improvement to drive efficiency and improve customer service levels.
  - Deal effectively with technical and non-technical constituencies.
What is required for application
- Bachelor’s Degree in Software Engineering or related field.
- 7+ years of prior experience in a Software Engineering, Site Reliability Engineering or related position.
What is in it for you
In addition to providing exciting work, career advancement opportunities, and a collaborative work environment, Vonage provides competitive pay and benefits including unlimited discretionary time off and tuition reimbursement.
Potential Next Career Move: Principal Site Reliability Engineer