MATHEW S

Data Engineer

Bengaluru, Karnataka, India · 4 yrs 8 mos experience

AI ML Practitioner · AI Enabled

Key Highlights

Expert in building automated data engineering pipelines.
Strong background in dimensional modeling and data warehousing.
Currently pursuing a Master's in AI to enhance data science skills.

Stackforce AI infers this person is a Data Engineering expert in the Fintech industry.

Skills

Core Skills

Data Engineering · Data Architecture

Other Skills

Abstracting · Agile Software Development · Analytical Skills · Attention to Detail · Azure Blob Storage · Big Data · Business Analysis · Business Insights · Business Intelligence (BI) · Business Requirements · C (Programming Language) · Calculus · Cascading Style Sheets (CSS) · Classification · Communication

About

Driven by a passion for data and AI, I build robust, automated data engineering pipelines, develop predictive and forecasting models, and deliver data-driven insights. I am currently pursuing a Master's in AI from International University of Applied Science, Germany, and studying advanced data science concepts at the Indian Institute of Technology Madras. In my role as a Manager in Data Engineering at Axis Bank, I apply this expertise to deliver innovative solutions and optimize data processes.

Experience

4 yrs 8 mos

Total Experience

2 yrs 1 mo

Average Tenure

1 yr 1 mo

Current Experience

EY Technology Solutions

Data Engineer

May 2025 – Present · 1 yr 1 mo · Bengaluru, Karnataka, India · Hybrid

Building a data warehouse for the US insurance domain using industry-standard data modelling concepts and a medallion architecture, on MS Fabric and Databricks.

Data Warehouse · Data Modelling · MS Fabric · Databricks · Data Engineering · Data Architecture

Axis Bank

2 roles

Data Engineer | Manager: BIU

Promoted

May 2024 – Dec 2025 · 1 yr 7 mos · Hybrid

Dimensional Modelling - Data Warehouse
Architected a star-schema data warehouse using SCD Type 0, SCD Type 1 and SCD Type 2 dimensions. Change detection was implemented with watermark columns, hash-value functions and the latest extracted timestamp. Data pipelines were built with SQL queries and PL/SQL, with an ETL tool orchestrating the dataflow.
Performance tracking / KPI Datamart
Iceberg tables were used to develop a KPI datamart for continuous performance tracking and improvement. The Spark engine provided the compute, and the pipeline was developed as PySpark scripts. The design included a metadata table, a parameter control table, a history table for retention and a performance tracking table.
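To make the hash-based SCD Type 2 approach from the Dimensional Modelling work more concrete, here is a minimal Python sketch. It is an illustration only, not the production implementation, which used SQL, PL/SQL and an ETL tool. All names (apply_scd2, row_hash, rows_after_watermark, the extracted_at, valid_from, valid_to and is_current columns) are hypothetical and chosen for the example.

```python
import hashlib
from datetime import datetime

HIGH_DATE = datetime(9999, 12, 31)  # assumed open-ended "valid_to" for current rows


def rows_after_watermark(rows, watermark, ts_col="extracted_at"):
    """Keep only rows extracted after the last processed watermark."""
    return [r for r in rows if r[ts_col] > watermark]


def row_hash(record, tracked_cols):
    """Hash the tracked attributes so a change is detected with one comparison."""
    payload = "|".join(str(record[c]) for c in tracked_cols)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_scd2(dimension, incoming, key, tracked_cols, load_ts):
    """Expire changed current rows and append new versions (SCD Type 2)."""
    current = {r[key]: r for r in dimension if r["is_current"]}
    for rec in incoming:
        new_hash = row_hash(rec, tracked_cols)
        existing = current.get(rec[key])
        if existing is not None and existing["row_hash"] == new_hash:
            continue  # attributes unchanged, nothing to do
        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = load_ts
            existing["is_current"] = False
        new_row = {**rec, "row_hash": new_hash, "valid_from": load_ts,
                   "valid_to": HIGH_DATE, "is_current": True}
        dimension.append(new_row)
        current[rec[key]] = new_row
    return dimension


if __name__ == "__main__":
    dim = []
    apply_scd2(dim, [{"customer_id": 1, "city": "Pune"}],
               "customer_id", ["city"], datetime(2024, 1, 1))
    apply_scd2(dim, [{"customer_id": 1, "city": "Bengaluru"}],
               "customer_id", ["city"], datetime(2024, 2, 1))
    for row in dim:
        print(row["city"], row["valid_from"], row["valid_to"], row["is_current"])
```

In this sketch an SCD Type 1 attribute would simply be overwritten on the current row rather than versioned, and an SCD Type 0 attribute would be left out of the tracked columns altogether.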

Dimensional Modelling · Data Warehouse · SQL · PL SQL · ETL · Spark · +3

Data Engineer | Deputy Manager: BIU

Jun 2022 – May 2024 · 1 yr 11 mos · Hybrid

1. Accumulating Periodic Snapshot Architecture
Created an accumulating periodic snapshot table and an incremental table that feed a downstream application. This was a system-to-system data integration between two databases, Oracle and MySQL, with Informatica as the ETL engine handling the system-to-system push.
2. Data Pipeline Integrating FTP Servers
Data assets are exchanged over FTP servers for business reporting and analytical consumption. This involves both inbound pulls and outbound pushes.
3. Azure Blob Storage and Dashboards
Dashboard reporting relies on data assets being available in cloud storage for processing, with ADLS acting as the cloud platform. IICS (Informatica Intelligent Cloud Services) is used as the cloud ETL engine, offering better scalability and processing power for pushing data to cloud storage.
4. Selenium Automation
External data assets need to be downloaded from third-party and open-source sites, where ETL integration tools have their own limitations. The Selenium package for Python was used to automate the mouse clicks and download the data assets.
5. Token-Based Event Trigger
Many business cases require data marts or data cubes to be updated only after the data assets they depend on have been refreshed, or after certain events. This is done by creating a token on the ETL server after the dependent event. Once the token is generated, the event is triggered and the reports are updated.
6. Production Monitor Framework
Once development is complete, production monitoring is the most important and time-consuming task. As engineers, our job is to automate things and reduce human effort. For this, a control table captures job run details, and shell scripts use it to monitor job status. Every hour the script produces a report of running, successful and failed tasks, including Data Quality (DQ) check results. Once the report is generated, the jobs where DQ has failed are investigated.
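
The hourly report described in item 6 can be sketched as follows. The original framework used shell scripts over a control table; this Python version is only an illustration of the same idea. The function name build_report, the status values and the job_name, status and dq_passed fields are assumptions made for the example, not the actual control table schema.

```python
from collections import Counter


def build_report(control_rows):
    """Summarize job runs by status and list jobs whose DQ check failed."""
    counts = Counter(row["status"] for row in control_rows)
    # dq_passed is None while a job is still running, so test for False explicitly.
    dq_failed = sorted(
        row["job_name"] for row in control_rows if row["dq_passed"] is False
    )
    lines = [f"{status}: {counts.get(status, 0)}"
             for status in ("RUNNING", "SUCCESS", "FAILED")]
    lines.append("DQ failed: " + (", ".join(dq_failed) or "none"))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical rows standing in for the control table contents.
    rows = [
        {"job_name": "load_accounts", "status": "SUCCESS", "dq_passed": True},
        {"job_name": "load_kpi", "status": "SUCCESS", "dq_passed": False},
        {"job_name": "load_txn", "status": "RUNNING", "dq_passed": None},
        {"job_name": "load_branch", "status": "FAILED", "dq_passed": None},
    ]
    print(build_report(rows))
```

Keeping the DQ result separate from the run status lets the report flag jobs that finished successfully but still need attention.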