Software Engineer - Site Reliability, Network Edge
New York, USA, Remote; Massachusetts, USA, Remote
We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
The Network Edge team is a mix of software engineers and systems-focused engineers who manage the systems that expose our applications and services to the internet for all products. We build load-balancing systems in a multi-cloud environment that can ingest high-throughput traffic and safely react to anomalies (malicious or not). As the first point of contact in the infrastructure, we focus heavily on building robust systems, working with the intake teams for each Datadog product, and giving those teams observability into what happens to their traffic at the edge.
As an engineer on the Network Edge team, you will contribute to building the Edge infrastructure required for exposing all Datadog services over the Internet, designing solutions that work at scale and with multi-cloud and multi-region constraints. You will work with infrastructure that is essentially the first system traversed by all of our customers' data, so thinking about high availability and resilience at every step is critical.
- Debug and maintain our existing infrastructure day to day, which requires deep technical analysis and work with our cloud providers to solve unprecedented issues.
- Build the next generation of our infrastructure using cutting-edge technologies such as Envoy, xDS, eBPF, and WebAssembly.
- Build applications and tooling to help automate and manage the network edge infrastructure in a multi-cloud, multi-region, multi-tenant environment.
- Work with product engineering teams to understand the constraints of their customers and design solutions to support their traffic.
- You have been dealing with systems at scale processing GB/s of data for 4+ years and know the systems you’ve worked on from top to bottom.
- You have experience with critical Internet-facing infrastructure such as BGP, DNS, load balancing, CDNs, and DDoS mitigation systems.
- You have programming experience with languages like Golang, Python, C, or C++.
- You have a good understanding of Linux internals.
- You want to work in a fast-paced, high-growth startup environment that respects its engineers and customers.
- You have significant public cloud experience, ideally with more than one cloud provider.
- You have a deep understanding of Linux networking.
- You have worked with load balancing technologies.
- You’re familiar with the Kubernetes APIs.
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.