Soumitra Shukla

Data Engineer

Bengaluru, Karnataka, India · 4 yrs 7 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in building scalable data pipelines.
  • Proficient in Python and Django for backend development.
  • Strong experience in data warehousing and ETL processes.
Stackforce AI infers this person is a Fintech Data Engineer with expertise in building scalable data solutions.

Skills

Core Skills

Python, Django, Data Warehousing, ETL, AWS Glue, Data Engineering

Other Skills

AWS S3, Adobe Photoshop, Airflow, Amazon Elastic MapReduce (EMR), Amazon S3, Apache Airflow, Apache Iceberg, Creative Entrepreneurship, Database tuning, Django REST Framework, HTML, Integration, Kafka, Keras, Leadership

About

I am a competent developer, well versed in algorithms and data structures (Google Foobar level 3), and confident in demonstrating my skills in an interview process. Backend tech stack: Django, Python. Distributed tech stack: Kafka, PySpark, Apache Iceberg, Apache Airflow. Languages: Java and Python.

Experience

Bright money

3 roles

SDE - II / Data Engineer

Promoted

Oct 2022 - Present · 3 yrs 5 mos

  • Data warehousing solution: [Debezium, Kafka Connect, PySpark, Django]
  • Designing and implementing Kafka Connect Transforms for Data Enrichment
  • Designing and implementing Data Pipelines and Data Cubes
  • ETL pipelines on Iceberg tables for better indexing, compaction, and queries.
  • Data validation and missing data computation for CDC replication flows of tables.
  • Implemented a business-specific wrapper on the Databricks Iceberg Sink
  • Designing and implementing a microservice for credit report data aggregation in real time and batch.
  • Integration of TransUnion credit report products, with Metro 2 handling.
  • Refresh pipelines, run as Airflow jobs, that keep data for 2.5M users recent for downstream use
  • Analyzed data flow and product usage extensively to help upgrade pipelines and support cost optimisations
  • Data backfill and enrichment (ETL jobs):
  • Account and user data enrichment and backfill for 2M accounts from a monolith to 9 microservices [using pandas, Airflow, AWS S3]
  • Transactions data for 502M transactions [using PySpark, AWS EMR, AWS Glue Jobs, AWS S3, Airflow]
Django, Python, Kafka, PySpark, Apache Iceberg, Apache Airflow +2

SDE

Aug 2021 - Oct 2022 · 1 yr 2 mos

  • Implementation and ownership of the end-to-end account aggregation service, as 16 different microservices, to support fetching financial data in real time and batch.
  • Building scalable refresh pipelines for keeping 4M accounts up to date in a short period using Airflow jobs.
  • Developing AWS Glue Jobs (PySpark) for providing data to various other services via S3.
  • Integration with Aggregators: Plaid, Fiserv, Capital One (Direct Integration), Finicity, Teller
  • Database and query-plan tuning for large tables with >2B records
  • Designing and implementing a service for identity (KYC) data validation in real time.
  • Integration with products like LexisNexis
AWS Glue, PySpark, Airflow, Database tuning, Integration, Data Engineering

Engineer Intern

Feb 2021 - Jul 2021 · 5 mos

Balloon

Web Developer Intern

Jun 2019 - Jul 2019 · 1 mo

Education

Institute of Engineering and Technology

Bachelor of Technology — Computer Science

Jan 2017 - Jan 2021
