Soumav Prakash

Backend Engineer

Bengaluru, Karnataka, India · 9 yrs 7 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in building efficient data pipelines and orchestration.
  • Led the development of near real-time anomaly detection frameworks.
  • Proficient in AWS services and big data technologies.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on big data solutions in SaaS.

Skills

Core Skills

Data Engineering · Apache Spark · Big Data Analytics

Other Skills

Amazon Web Services (AWS) · Apache Airflow · Apache Kafka · Apache Spark Streaming · Architectural Design · BIG DATA · C · C++ · Data Architects · Data Cleaning · Data Loading · Data Mining · Data Science · Django · Django REST Framework

About

Driving the data pipelines behind insights for key Walmart initiatives and helping stakeholders make data-driven decisions. Worked on distributed systems and built efficient data pipelines along with job orchestration. Experienced in product integrations, building Spark/Hadoop-based applications, scheduling data pipelines via Airflow, implementing AWS services, and deploying on Kubernetes and Docker. Previously involved in building an expense management product, including feature planning, product roadmaps, data integrations, API integrations, and product implementation.
Tech Stack - Apache Spark, Apache Hudi, Delta Lake, Streaming, Hive, Hadoop, Apache Airflow
Data Platforms Handled - AWS EMR, AWS S3, BigQuery, Dataproc, SFTP/FTP, HDFS, Kafka
Languages - Python, SQL, Java, Scala, C

Experience

Walmart Global Tech India

2 roles

Senior Data Engineer

Promoted

Sep 2024 - Present · 1 yr 6 mos · Hybrid

  • Doing more of what I used to do earlier, more efficiently now :p
  • Built the Automatic Trigger framework for anomalies detected in the Generic Notification Framework, for use cases like deactivation and suspension of fraudulent drivers.
  • Helping with the design and infra setup of various implementations across the team.
Data Loading · Data Architects · Engineering Data Management · Extract, Transform, Load (ETL) · Python (Programming Language) · Apache Airflow · +16

Data Engineer III

Mar 2021 - Sep 2024 · 3 yrs 6 mos · Hybrid

  • Near-Real-Time Ingestions
      • Developed a dispatcher delivery NRT ingestion system with latency as low as 3 minutes, handling 200 million to over 1 billion records per day.
      • Implemented Walmart Go-Local delivery NRT ingestion with latency as low as 1 minute, ensuring timely data processing.
      • Multiple teams now depend heavily on both of these data sources for critical NRT monitoring and analysis of daily store and Last Mile performance.
  • Near Real-Time Anomaly Detection Framework (D-Notify)
      • Developed a near real-time (NRT) anomaly detection framework with seamless API integration with Walmart services; a pilot for deactivating drivers for fraud is ongoing.
      • Implemented a Generic Notification system that can send UI and email push notifications and can be configured to make manual or automatic API calls on failures.
      • Established best practices for alert query expressions and database design to streamline alert onboarding.
      • Built the Looker integrations along with data modelling for viewing the execution results.
  • Last Mile Data Product (D-Scribe)
      • Designed and built ETL processes to aggregate over 350 driver and market-level metrics using.
      • Working with the data science team to generate the Driver Fraud score in near real time, so it can be used to enable better offers, incentives, and reliability for Walmart customers.
      • Developed backend services for APIs managing driver profiles, summary tabs, notifications, actions, and more, which will reduce the manual review process from the existing 3 minutes to 15 seconds.
      • Leading the Trust and Safety and Driver Metrics aspects of the product, including requirement discussions and PRD design.
      • Working on improving the platform by analysing various types of fraud to prevent fraudulent activity by drivers on the Walmart Spark platform.
  • Cost & Memory Optimizations
      • Reduced Class A and Class B costs by optimizing ETL pipelines through Spark optimizations and serverless adoption.
Data Loading · Apache Spark · Scala · PySpark · Google BigQuery · Data Architects · +11

Quaero

Data Engineer

Jul 2020 - Mar 2021 · 8 mos · Bengaluru, Karnataka, India

  • Customer Data Platform
      • Engineered extract-and-transfer connectors integrating S3, SFTP, HDFS, and FTP, facilitating the onboarding of major clients onto the platform.
      • Developed and implemented the Twitter Ads integration feature for the platform, enabling efficient marketing campaign creation and audience uploads.
      • Designed and tested Proof of Concepts (POCs) for third-party connectors, evaluating scalability and new functionalities.
      • Enhanced platform security by implementing IAM role-based authentication for AWS services.
Data Loading · Apache Spark · PySpark · Data Architects · Apache Airflow · Engineering Data Management · +5

Happay - expense management solution for businesses

Product Integration Specialist

Jan 2019 - Jul 2020 · 1 yr 6 mos · Bangalore

  • Product Integrations
  • API Integrations
  • Data Integrations
  • Product Implementation
Data Loading · Engineering Data Management

Accenture

Associate Software Engineer

May 2017 - Jan 2019 · 1 yr 8 mos · Bengaluru, Karnataka, India

  • Developed a robust platform to process meter readings combined with user-weather data for smart grid infrastructures, leveraging big data technologies.
  • Optimized performance by identifying and configuring appropriate memory/core settings for Drivers and Executors, and implementing efficient partitioning/repartitioning strategies.
  • Facilitated seamless release and upgrade activities, ensuring smooth deployments and minimal system downtime.
  • Automated test scripts using Selenium, enhancing testing efficiency and precision.
Data Loading · Engineering Data Management · Spark

Veer Surendra Sai University Of Technology (VSSUT, Formerly UCE), Burla

Placement Coordinator

Aug 2016 - Aug 2017 · 1 yr · Burla

  • Reached out to recruiters and connected them with the best talent.

IIT Hyderabad

Microsoft Sponsored Research Intern

May 2016 - Jul 2016 · 2 mos · IIT Hyderabad

  • Worked on the project “Pilot Recommender System using Big Data Analytics on taxi-out patterns of a flight for path optimization”, sponsored by Microsoft Research India, under the guidance of Dr. Sobhan Babu from 16th May 2016 to 16th July 2016 at the Dept. of CSE, Indian Institute of Technology, Hyderabad.

Indian Institute of Technology, Kharagpur

SRIC Sponsored Research Intern

May 2015 - Jul 2015 · 2 mos · Kharagpur Area, India

  • IoT platform replication using advanced methods to implement a plug-and-play architecture. Was involved in building the entire front end and back end of the system.

Education

Veer Surendra Sai University Of Technology (VSSUT, Formerly UCE), Burla

Bachelor's degree — Information Technology

Jan 2013 - Jan 2017
