Saikat Das, AI-DATA

Data Scientist

Pune, Maharashtra, India · 7 yrs 2 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in architecting scalable lakehouse architectures.
  • Proven track record in modernizing legacy ETL pipelines.
  • Strong mentorship skills, enhancing engineering best practices.
Stackforce AI infers this person is a Data Engineering expert in SaaS and Fintech environments.

Skills

Core Skills

Lakehouse Architecture · Distributed Data Processing · Data Modeling · Medallion Architecture · Global Trade Lakehouse · Data Model Governance · Data Pipeline Efficiency · ETL Data Migration

Other Skills

Databricks · AWS · ML feature engineering · distributed ingestion pipelines · Snowflake · Oracle Data Engineering · AI-assisted Automation · ETL Design · Oracle RDBMS · PySpark · SQL · Informatica IICS · SnapLogic · Power BI · Data Modelling

About

Senior Data Engineer with 7+ years of experience designing and operating modern data platforms supporting analytics and machine learning workloads. My expertise lies in building scalable lakehouse architectures using Databricks and Apache Spark across AWS and Azure cloud environments. I specialize in distributed data processing systems, streaming data pipelines, and reliable data platforms that power analytics and ML-driven applications.

Key areas of expertise:

  • Lakehouse Architecture (Databricks, Delta Lake)
  • Distributed Data Processing (Apache Spark / PySpark)
  • Streaming Data Systems (Kafka, Spark Structured Streaming)
  • Data Platform Modernization
  • Data Governance and Data Quality Frameworks
  • ML Feature Engineering Pipelines
  • AI-assisted Data Engineering Workflows

In my recent roles I have worked on:

  • Architecting Databricks-based data platforms
  • Modernizing legacy batch ETL pipelines into streaming architectures
  • Building scalable feature pipelines supporting ML applications
  • Designing distributed data processing frameworks for regulatory reporting
  • Mentoring junior engineers and improving engineering best practices

I am particularly interested in solving complex data infrastructure challenges and building data platforms that enable analytics, AI, and product intelligence.

Technologies: Databricks • Apache Spark • PySpark • Delta Lake • Snowflake • Kafka • Airflow • AWS • Azure • Python • SQL

Experience

Total Experience: 7 yrs 2 mos
Average Tenure: 1 yr 9 mos
Current Experience: --

Proton.ai

Senior Data Engineer

Sep 2025 - Mar 2026 · 6 mos · Remote

  • Architected a Databricks Lakehouse platform on AWS enabling unified analytics and ML workloads.
  • Built distributed ingestion pipelines integrating multiple SaaS data sources.
  • Developed ML feature engineering pipelines supporting recommendation systems.
  • Mentored junior engineers on Spark optimization and scalable data platform design.
Databricks · AWS · ML feature engineering · distributed ingestion pipelines · Lakehouse Architecture · Distributed Data Processing

Workday

Senior Data Engineer

Oct 2024 - Sep 2025 · 11 mos · India · Remote

  • Medallion Architecture Design: Architected a real-time Medallion Lakehouse on Snowflake with structured zone-based modeling (Bronze → Silver → Gold), integrating Oracle ERP and Salesforce feeds for unified analytics.
  • Enterprise Data Modeling: Built dimensional and third-normal-form (3NF) models aligning with enterprise architecture frameworks; standardized business entities and relationships across finance and HR domains.
  • AI-Enhanced Data Modeling: Used Claude Code and Agentic AI to auto-generate DDLs, schema evolution scripts, and PySpark transformations, reducing overall modeling time by 35%.
  • Oracle Integration: Developed optimized ingestion pipelines using Snowpipe and Python connectors for Oracle extracts, ensuring schema consistency and incremental data capture.
  • Key Skills: Data Modeling, Oracle Data Engineering, Snowflake, Medallion Architecture, PySpark, AI-assisted Automation, ETL Design, Semantic Layering
Snowflake · Oracle Data Engineering · AI-assisted Automation · ETL Design · Data Modeling · Medallion Architecture

UBS

Senior Data Engineer (Gold UBS Certified Engineer)

Mar 2022 - Oct 2024 · 2 yrs 7 mos · India · Remote

  • Global Trade Lakehouse: Designed and modeled the Databricks Medallion Lakehouse to process 500GB+ of daily trade data, integrating feeds from Oracle RDBMS and on-prem data warehouses.
  • Data Model Governance: Established gold zone semantic models and lineage tracking to maintain metric integrity and compliance with MiFID II and Basel III.
  • Performance Optimization: Migrated and refactored SQL and PL/SQL logic from Oracle into vectorized PySpark, improving query response times and cutting cloud costs by $45,000 annually.
Databricks · Oracle RDBMS · PySpark · SQL · Global Trade Lakehouse · Data Model Governance

Infosys

3 roles

Data Engineer (Technology Analyst)

Promoted

Apr 2021 - Feb 2022 · 10 mos · Bengaluru, Karnataka, India

  • Increased data pipeline efficiency: Used Informatica IICS and SnapLogic to automate data ingestion and integration processes, improving efficiency by 45%.
  • Delivered high-quality data solutions: Designed and implemented ETL data migration and integration pipelines using PySpark/SparkSQL and Databricks Pipelines ETL notebooks, ensuring data accuracy and meeting client requirements (99.5% data accuracy rate).
  • Empowered business stakeholders: Provided data insights through Power BI dashboards and reports, enabling stakeholders to make data-driven decisions (increased user adoption of reports by 65%).
  • Enhanced cross-functional collaboration: Collaborated with marketing, sales, and product development teams to identify data needs and deliver analytical support, resulting in a 70% improvement in a key metric (increased marketing campaign ROI by 25%); also enabled AI-interactive Power BI dashboards with the Q&A feature.
  • Maintained data integrity: Established rigorous validation processes and regular data audits via Alation to guarantee data accuracy and reliability (reduced data discrepancies by 96%).
Informatica IICS · SnapLogic · Power BI · PySpark · Data Pipeline Efficiency · ETL Data Migration

Data Engineer (Sr. System Engineer)

May 2020 - Apr 2021 · 11 mos · Bengaluru, Karnataka, India

Data Engineer (System Engineer)

Nov 2018 - Apr 2020 · 1 yr 5 mos · Bengaluru, Karnataka, India

Education

Heritage Institute of Technology

Bachelor's degree — Computer Science

Aug 2014 - Jul 2018

St. Xavier's Collegiate School - India

Junior college — Computer Science

Jan 2012 - Jan 2014
