The Role
The Azure Databricks Developer will be responsible for designing, developing, and maintaining data processing workflows and analytics solutions using Azure Databricks.
Use business requirements to drive the design of data solutions/applications and technical architecture.
Create technical, functional, and operational documentation for data pipelines and applications.
Develop and maintain ETL (Extract, Transform, Load) pipelines using Databricks to process and transform large datasets.
Collaborate with data engineers and data scientists to design and implement scalable and efficient data processing workflows.
Build and optimize Apache Spark jobs and clusters on the Databricks platform.
Develop and maintain data ingestion processes to acquire data from various sources and systems.
Implement data quality checks and validation procedures to ensure accuracy and integrity of data.
Perform data analysis and exploratory data mining to derive insights from complex datasets.
Design and implement machine learning workflows using Databricks for predictive analytics and model training.
Coordinate and participate in structured peer reviews / walkthroughs / code reviews.
Work effectively in an Agile Scrum environment (JIRA / Azure DevOps).
Stay updated with the latest advancements in big data technologies and contribute to the improvement of existing systems and processes.
The Requirements
B.S. in Computer Science/Engineering or a relevant field
8+ years of experience in the IT industry
3+ years of hands-on experience in data engineering/ETL using Databricks notebook programming and functions on Azure or any other cloud infrastructure
Solid Databricks development experience, including significant work with Python, PySpark, Spark SQL, Pandas, and NumPy in an Azure environment.
Hands-on experience building data pipelines using Databricks and Apache Spark.
Hands-on experience designing and delivering solutions using Terraform and Azure DevOps agents.
Experience creating mount points for ADLS Gen2 storage in DBFS to implement RBAC for end users.
Strong understanding of distributed computing principles and experience with large-scale data processing frameworks.
Experience with CI/CD on Databricks using tools such as Azure DevOps (AZDO) Git and the Databricks CLI
Experience working with structured and unstructured data.
Strong understanding of Data Management principles (quality, governance, security, privacy, life cycle management, cataloguing). Unity Catalog experience desirable.
Experience with Delta Lake, Unity Catalog, Delta Sharing, and Delta Live Tables (DLT)
Able to work independently
Excellent oral and written communication skills
Experience:
Databricks: 3 years (Required)
Azure: 3 years (Required)
Cloud development: 5 years (Required)
Python: 3 years (Required)
Nice to have: Azure Synapse, Databricks Lakehouse Architecture, Azure Data Factory (ADF), Power BI, predictive analytics, AI/ML, Medallion architecture
Nice to have: Microsoft Azure Databricks and Azure Data Engineer certifications.
WTW is an Equal Opportunity Employer