Sagnik Mukherjee

Data Engineer

Bengaluru, Karnataka, India · 3 yrs 5 mos experience
AI/ML Practitioner · AI Enabled

Key Highlights

  • Expert in building scalable ETL pipelines on cloud platforms.
  • Improved data freshness by 95% through innovative solutions.
  • Reduced operational costs by 34% with efficient data processing.

Skills

Core Skills

Data Engineering · ETL

Other Skills

Analytics · Automated Machine Learning (AutoML) · Azure Cloud · Azure Data Factory · BI Publisher · Big Data · Big Data Analytics · Business Insights · Business Intelligence (BI) · C++ · Cloud · Cloud Computing · Computer Science · Continuous Integration and Continuous Delivery (CI/CD) · Data Analysis

About

Experienced Data Engineer with hands-on expertise in building and scaling batch ETL pipelines across Azure and Oracle Cloud. Proficient in Python, PySpark, Databricks, Azure Data Factory (ADF), Kafka, Oracle Data Integrator (ODI), Oracle Data Flow, and OCI Data Integration. Skilled in designing data architectures for Supply Chain and E-commerce domains, optimizing performance, and automating monitoring and pipeline orchestration.

Experience

EY

Data Engineer

Nov 2025 - Present · 4 mos · Kolkata, West Bengal, India · Hybrid

Oracle

2 roles

Data Engineer L3

Promoted

Sep 2024 - Oct 2025 · 1 yr 1 mo · Bengaluru, Karnataka, India

  • Designed and delivered a batch data platform using Azure Data Factory to extract data from Oracle DB, APIs, and CRM systems, improving data freshness by 95%.
  • Spearheaded a cross-functional initiative to unify ingestion logic across 12+ systems, standardizing parameter handling (sketched below) and eliminating 70% of manual QA escalations.
  • Facilitated stakeholder alignment by leading sprint retrospectives with category teams to review data quality and reporting KPIs.
Azure Data Factory · Oracle Database · Data Quality · Stakeholder Management · Data Engineering · ETL
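
A minimal sketch of what a standardized ingestion parameter contract like the one described above could look like. The dataclass fields, defaults, and source-system names are illustrative assumptions, not the actual ADF pipeline parameters.

```python
# Illustrative only: a shared parameter contract that every ingestion job
# (Oracle DB, API, CRM sources) could be validated against before a pipeline
# trigger fires. Field names are assumptions, not the real schema.
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IngestionParams:
    source_system: str        # e.g. "oracle_erp", "crm", "orders_api"
    dataset: str              # logical table or endpoint name
    load_type: str            # "full" or "delta"
    watermark_column: str     # column used to cut delta windows
    window_start: datetime
    window_end: datetime


def build_params(source_system: str, dataset: str, overrides: dict) -> IngestionParams:
    """Merge per-source overrides onto shared defaults so every pipeline
    receives the same parameter shape, regardless of the source system."""
    defaults = {
        "load_type": "delta",
        "watermark_column": "last_updated_ts",
        "window_start": datetime(1970, 1, 1),
        "window_end": datetime.utcnow(),
    }
    merged = {**defaults, **overrides}
    return IngestionParams(source_system=source_system, dataset=dataset, **merged)
```

A contract like this is one way to keep 12+ source systems behind a single parameter shape; the actual implementation could equally live in ADF pipeline parameters or a metadata table.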

Data Engineer L2

Aug 2022 - Aug 2024 · 2 yrs · Bengaluru, Karnataka, India

  • Built scalable ETL pipelines using PySpark on OCI Data Flow, processing 50TB full loads and 30GB daily delta files to support analytics for business-critical domains.
  • Reduced compute cost by 34% through efficient partitioning strategies and use of broadcast joins in Spark jobs (sketched below).
  • Developed a centralized monitoring and alerting solution, standardizing logging, retries, and failure notifications to reduce RCA time by 60% (retry wrapper sketched below).
  • Enabled real-time alerting for 100+ pipelines via OCI Email Delivery, improving issue response and system reliability.
  • Partnered with functional teams to orchestrate 25+ ELT jobs using ODI and OCI Functions, spanning Finance, Procurement, and Lease Accounting.
PySpark · OCI Data Flow · Monitoring Solutions · Data Orchestration · Data Engineering · ETL
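
A hedged PySpark sketch of the partitioning and broadcast-join pattern referenced above. Bucket paths, table names, and columns are illustrative, and the real jobs ran on OCI Data Flow rather than a local session.

```python
# Illustrative PySpark sketch: repartition the large delta feed on its join key
# and broadcast the small dimension table, so the join avoids a full shuffle.
# Paths, table names, and columns are assumptions for the example.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("delta-load-sketch").getOrCreate()

# ~30 GB daily delta file (large side of the join).
orders_delta = spark.read.parquet("oci://bucket@namespace/landing/orders/dt=2024-08-01/")

# Small reference table (a few MB), safe to broadcast to every executor.
suppliers = spark.read.parquet("oci://bucket@namespace/reference/suppliers/")

enriched = (
    orders_delta
    # Partition by the join/write key so downstream stages stay balanced.
    .repartition("supplier_id")
    .join(F.broadcast(suppliers), on="supplier_id", how="left")
    .withColumn("load_dt", F.lit("2024-08-01"))
)

# Partitioned write keeps daily deltas cheap to prune on later reads.
enriched.write.mode("overwrite").partitionBy("load_dt").parquet(
    "oci://bucket@namespace/curated/orders_enriched/"
)
```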
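A minimal sketch of the kind of standardized retry-and-alert wrapper the monitoring bullet describes; `send_alert` is a stub standing in for the OCI Email Delivery notification, and all names are assumptions rather than the actual solution.

```python
# Illustrative sketch of a shared retry/logging wrapper for pipeline tasks.
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")


def send_alert(subject: str, body: str) -> None:
    # Stub: in the real setup this would publish via OCI Email Delivery.
    log.error("ALERT | %s | %s", subject, body)


def run_with_retries(task_name, task_fn, retries=3, backoff_seconds=60):
    """Run a pipeline task with standardized logging, retries, and a failure alert."""
    for attempt in range(1, retries + 1):
        try:
            log.info("%s: attempt %d/%d", task_name, attempt, retries)
            return task_fn()
        except Exception as exc:  # broad on purpose: orchestration catch-all
            log.warning("%s failed on attempt %d: %s", task_name, attempt, exc)
            if attempt == retries:
                send_alert(f"{task_name} failed", str(exc))
                raise
            time.sleep(backoff_seconds * attempt)
```

Centralizing the wrapper is what makes the logging format, retry policy, and failure notification identical across 100+ pipelines instead of being reimplemented per job.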

Education

Heritage Institute of Technology

Bachelor of Technology (BTech), Electronics and Instrumentation Engineering

Jan 2018 - Jan 2022

South Point High School, Kolkata

CBSE

Jan 2016 - Jan 2018
