Data Engineer
Data Engineer-March 2024
Pune
Mar 28, 2026
About Data Engineer

  About Fusemachines

  Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 450 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.

  About the role:

  This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics).

  We are seeking a skilled Data Engineer with a strong background in Python, PySpark, and cloud-based big data applications. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of Big Data products in the media and broadcasting industry. This role involves hands-on coding and collaboration with multidisciplinary teams to achieve project objectives.

  Qualification & Experience

  Must have a full-time Bachelor's degree in Computer Science or similar

  At least 2 years of experience as a data engineer with strong expertise in Azure or other hyperscalers.

  2+ years of experience with Azure DevOps, Azure Cloud Platform, or other hyperscalers.

  Proven experience delivering Data and Analytics projects as a data engineer.

  Following certifications:

  Microsoft Certified: Azure Fundamentals

  Microsoft Certified: Azure Data Engineer Associate

  Databricks Certified Associate Developer for Apache Spark

  Databricks Certified Data Engineer Associate, nice to have

  Required skills/Competencies

  Strong programming skills in one or more languages such as Python (must have) and Scala, with proficiency in writing efficient and optimized code for data integration, storage, processing and manipulation.

  Strong experience using Markdown to document code, or using automated documentation tools (e.g., PyDoc).

  Strong experience with scalable and distributed data processing technologies such as Spark/PySpark (must have; experience with Azure Databricks is a plus), dbt and Kafka, to be able to handle large volumes of data.

  Strong experience in designing and implementing efficient ELT/ETL processes in Azure and with open source solutions, including the ability to develop custom integration solutions as needed.

  Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.

  Expertise in data cleansing, transformation, and validation.

  Hands-on experience with Jupyter Notebooks and with Python packaging and dependency management tools: Poetry, Pipenv.

  Proficiency with Relational Databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL Databases (MongoDB or Table).

  Good understanding of Data Modeling and Database Design Principles, including the ability to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions.

  Strong knowledge in SQL.

  Strong experience in designing and implementing Data Warehousing solutions in Azure with Azure Synapse Analytics and/or Snowflake.

  Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.

  Strong knowledge of SDLC tools and technologies such as Azure DevOps, including project management software (Jira, Azure Boards or similar), source code management (GitHub, Azure Repos, Bitbucket or similar), CI/CD systems (GitHub Actions, Azure Pipelines, Jenkins or similar) and binary repository managers (Azure Artifacts or similar).

  Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC), configuration management, automated testing and cost management.

  Knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics, such as Azure Data Factory, Azure Databricks, Azure Synapse Analytics (formerly SQL Data Warehouse), Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc.

  Experience in orchestration using technologies like Apache Airflow.

  Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.

  Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.

  Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.

  Experience with BI solutions including PowerBI and Tableau is a plus.

  Knowledge of containers and their environments (Docker, Podman, Docker-Compose, Kubernetes, Minikube, Kind, etc.).

  Good problem-solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues.

  Strong written and verbal communication skills to collaborate with cross-functional teams, including data architects, DevOps engineers, data analysts, data scientists, developers, and operations teams.

  Ability to document processes, procedures, and deployment configurations.

  Understanding of Azure security practices, including network security groups, Azure Active Directory, encryption, and compliance standards.

  Ability to implement security controls and best practices within data and analytics solutions, including proficient knowledge of and working experience with various cloud security vulnerabilities and ways to mitigate them.

  Self-motivated with the ability to work well in a team.

  A willingness to stay updated with the latest Azure services, Data Engineering trends, and best practices in the field.

  Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.

  Responsibilities

  Design, develop, test and maintain high-performance, large-scale, complex data architectures, which support data integration (batch and real-time, ETL and ELT patterns from heterogeneous data systems: APIs and platforms), storage (data lakes, warehouses, marts, etc.), processing, orchestration and infrastructure, ensuring the scalability, reliability, and performance of data systems.

  Contribute to detailed design, architectural discussions, and customer requirements sessions.

  Actively participate in the design, development, and testing of big data products.

  Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive).

  Design and develop clear, maintainable code with automated testing using Pytest, unittest, etc.

  Collaborate with cross-functional teams, including Product, Engineering, Data Scientists and Analysts, to understand data requirements and develop data solutions, including reusable components that meet product deliverables.

  Evaluate and implement new technologies and tools to improve data integration, data processing, storage and analysis.

  Evaluate, design, implement and maintain data governance solutions: cataloging, lineage, data quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns.

  Ensure data quality and accuracy.

  Design, implement and maintain data security and privacy measures.

  Be an active member of an Agile team, participating in all ceremonies and continuous improvement activities, being able to work independently as well as collaboratively.

  Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
