We are looking for a Spark developer who knows how to fully exploit the potential of our Spark cluster. You will clean, transform, and analyze vast amounts of raw data from various systems using Spark to provide ready-to-use data to our feature developers and business analysts. This involves both ad-hoc requests and data pipelines embedded in our production environment.
Roles and Responsibilities
- Carry out systems analysis, design, coding, unit testing, and other SDLC activities
- Gather and analyze requirements, translate functional requirements into concrete technical tasks, and provide reasonable effort estimates
- Create Scala/Spark jobs for data transformation and aggregation (a minimal sketch follows this list)
- Write unit tests for Spark transformations and helper methods
- Design data processing pipelines
- Work proactively, both independently and with global teams, to meet project requirements, and raise issues or challenges with enough lead time to mitigate delivery risks
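By way of illustration, the sketch below shows the kind of deliverable this role produces: a small Spark aggregation written as an I/O-free function, followed by a ScalaTest unit test that exercises it on a local SparkSession. All names here (RevenueJob, the column names, the sample data) are hypothetical, chosen only to make the example self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._
import org.scalatest.funsuite.AnyFunSuite

object RevenueJob {
  // Aggregate raw order rows into per-customer revenue.
  // Keeping the transformation free of I/O is what makes it unit-testable.
  def revenueByCustomer(orders: DataFrame): DataFrame =
    orders
      .filter(col("status") === "COMPLETED")
      .groupBy("customerId")
      .agg(sum("amount").as("totalRevenue"))
}

class RevenueJobTest extends AnyFunSuite {
  // A local SparkSession is enough for unit tests; no cluster needed.
  private lazy val spark: SparkSession = SparkSession.builder()
    .master("local[2]")
    .appName("revenue-job-test")
    .getOrCreate()

  test("sums completed orders per customer") {
    import spark.implicits._
    val orders = Seq(
      ("c1", "COMPLETED", 10.0),
      ("c1", "COMPLETED", 5.0),
      ("c1", "CANCELLED", 99.0), // must be filtered out
      ("c2", "COMPLETED", 7.0)
    ).toDF("customerId", "status", "amount")

    val result = RevenueJob.revenueByCustomer(orders)
      .collect()
      .map(r => r.getString(0) -> r.getDouble(1))
      .toMap

    assert(result == Map("c1" -> 15.0, "c2" -> 7.0))
  }
}
```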
Requirements
- 10 to 12 years of hands-on experience
- Experience with Apache Spark's streaming and batch frameworks
- Scala (with a focus on the functional programming paradigm)
- Experience with the Azure cloud platform and Databricks
- Experience with PySpark
- Testing with ScalaTest, JUnit, and Mockito
- Spark query tuning and performance optimization (a short sketch follows this list)
- Experience with MongoDB
- Experience with Kafka, Storm, and ZooKeeper
- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)
- Clear and concise written and verbal communication
- Ability to work in a fast-paced environment both as an individual contributor and a tech lead
- Experience with Git
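To give a flavor of the query tuning we have in mind, here is a minimal, hypothetical sketch: hinting Spark to broadcast a small dimension table so a join avoids a full shuffle, then inspecting the physical plan with explain() before shipping. The paths and column names are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object JoinTuning {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("join-tuning").getOrCreate()

    val events    = spark.read.parquet("/data/events")     // large fact table
    val countries = spark.read.parquet("/data/countries")  // small dimension table

    // Broadcasting the small side turns a shuffle (sort-merge) join
    // into a broadcast hash join, skipping the shuffle of `events`.
    val enriched = events.join(broadcast(countries), Seq("countryCode"))

    // Confirm the chosen join strategy in the physical plan.
    enriched.explain()

    enriched.write.mode("overwrite").parquet("/data/events_enriched")
    spark.stop()
  }
}
```

Spark broadcasts small tables automatically below spark.sql.autoBroadcastJoinThreshold, but an explicit hint keeps the intent visible in code review.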