
Divakar Singh

Data Engineer

Pune, Maharashtra, India · 5 yrs 7 mos experience

Key Highlights

  • Architected a unified Kafka ingestion framework saving 70% in costs.
  • Developed fraud detection pipeline reducing fraudulent activity by over 20%.
  • Redesigned ETL pipelines achieving 40% performance improvement.

Skills

Core Skills

Data Engineering · Data Architecture

Other Skills

AWS (S3, Lambda, EMR) · Airflow · Airflow (MWAA) · Algorithms · Apache Cassandra · Apache Kafka · BigQuery · C (Programming Language) · C++ · CI/CD · Cassandra · Computer Vision · Continuous Integration and Continuous Delivery (CI/CD) · DBMS · Data Analysis

About

I'm a passionate Data Engineer with 5+ years of experience building scalable, cost-efficient, real-time data systems across high-growth product and analytics teams. At StockX, I architected and delivered a unified real-time + batch Kafka ingestion framework, replacing legacy tools and reducing infra costs by 70%. I've also led the creation of Customer360 Delta tables and fraud detection pipelines used across product, marketing, and BI orgs. Previously at Figmd, I built large-scale anomaly detection systems and modernized ETL pipelines, improving performance by up to 40% and cutting turnaround time for production incidents from 1 week to under 2 days.

🔹 What I Do:
✔ Architect streaming & batch pipelines with Apache Kafka and Spark
✔ Build medallion-style Lakehouse architectures using Delta Lake, Unity Catalog & dbt
✔ Design scalable infra on AWS/GCP using Terraform & CI/CD pipelines
✔ Enable Customer360 insights, fraud detection, and CRM initiatives across business teams
✔ Implement data quality and lineage frameworks for audit and SLA compliance

💡 Passionate About:
🔹 Turning messy data into reliable, governed, fast-access insights
🔹 Optimizing cost, observability, and resilience in production data pipelines
🔹 Staying hands-on with open-source data tools & new paradigms
🔹 Exploring how GenAI + LLMs can power smarter data engineering workflows

🏆 Awards & Highlights:
• Creative Solutions Award – StockX (2024)
• Best Innovation – Figmd (2022)
• 17th National Rank – CodeIT Suisse Hackathon (2021)

🛠 Tools I Work With Regularly:
Python, PySpark, Apache Spark, Kafka, dbt, Airflow, Databricks, Terraform, Delta Lake, Unity Catalog, Schema Registry, DynamoDB, PostgreSQL, BigQuery

🧠 I enjoy building clean, reliable, observable data pipelines that directly impact business outcomes. Curious by nature and always optimizing, whether it's cost, latency, or reliability.

👋 Always happy to learn, collaborate, and build with fellow engineers and data-driven minds.

Experience

StockX

Data Engineer

May 2022 – Present · 3 yrs 10 mos · Bengaluru, Karnataka, India

At StockX, I led the design and development of real-time and batch data pipelines that support global-scale analytics, marketing personalization, fraud detection, and operational monitoring.

Key Contributions:
  • Architected a unified Kafka ingestion framework replacing Lambda + Kafka Connect; achieved 70% infra cost savings and improved scalability across ingestion jobs.
  • Built Customer 360 Delta Lake tables to consolidate customer behavior and transaction data — now a central source for BI, CRM, and product analytics.
  • Developed the “Seller Friction” fraud detection pipeline, reducing fraudulent seller activity by 20%+ through near-real-time behavioral tracking.
  • Implemented a real-time ad targeting pipeline (AIA) using Kafka, Spark, and schema registry for responsive ad delivery workflows.
  • Standardized diff workflows using dbt + Unity Catalog, ensuring data traceability and audit compliance across stakeholder teams.
  • Created a Kafka ingestion monitoring framework that proactively tracks pipeline lags — critical for preventing failures during peak events like holiday sales.
  • Delivered automation frameworks for scalable deployments using Terraform, Databricks jobs, and CI/CD integrations.
Apache Kafka · Spark · Delta Lake · Terraform · CI/CD · Databricks (+3)
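The lag-monitoring idea mentioned above can be illustrated with a small sketch. Everything here is hypothetical: offsets arrive as plain dicts keyed by (topic, partition), whereas a real framework would fetch log-end and committed offsets through a Kafka client such as confluent-kafka and ship alerts to a monitoring system.

```python
# Hypothetical sketch of a consumer-lag check like the monitoring
# framework described above. Offsets are passed in as plain dicts;
# a real implementation would fetch them from the Kafka admin/consumer
# APIs and run this check on a schedule.

def partition_lag(end_offsets, committed_offsets):
    """Per-partition lag: log-end offset minus committed offset.

    Partitions the consumer has never committed to count as fully lagged.
    """
    return {
        tp: end_offsets[tp] - committed_offsets.get(tp, 0)
        for tp in end_offsets
    }

def lagging_partitions(end_offsets, committed_offsets, threshold):
    """Return only the partitions whose lag exceeds the alert threshold."""
    lags = partition_lag(end_offsets, committed_offsets)
    return {tp: lag for tp, lag in lags.items() if lag > threshold}

if __name__ == "__main__":
    # Illustrative numbers only: partition 1 is far behind, partition 0 is fine.
    end = {("orders", 0): 1500, ("orders", 1): 900}
    committed = {("orders", 0): 1480, ("orders", 1): 100}
    print(lagging_partitions(end, committed, threshold=100))
    # {('orders', 1): 800}
```

In practice the interesting design choice is the threshold: a fixed offset gap works for steady topics, while bursty topics (e.g. holiday-sale traffic) usually need a time-based or rate-based lag measure instead.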

Figmd, Inc.

Software Engineer

Aug 2020 – May 2022 · 1 yr 9 mos · Pune, Maharashtra, India · Remote

At Figmd, I focused on improving data pipeline performance, automating infrastructure operations, and enabling downstream analytics teams with faster, cleaner data. My work reduced costs, improved turnaround time for incident resolution, and strengthened the reliability of core healthcare reporting systems.

Key Contributions:
  • ⚙️ Redesigned and optimized the CDR-to-Datamart ETL pipeline, resulting in 40% faster performance and up to 50% infrastructure cost savings.
  • 🕵️ Developed a data anomaly backtracking system, enabling teams to trace root causes and reduce ticket resolution time from 1 week to under 2 days.
  • 📅 Built and automated the Cassandra upgrade/downgrade pipeline end-to-end — including schema migrations, version tracking, and JIRA integration for ops reporting.
  • 📈 Designed a time-series forecasting tool using FbProphet to predict patient visit counts, improving downstream resource planning for clinical teams.
  • 🔍 Worked closely with analysts and QA teams to ensure high data quality and consistency across downstream reports and dashboards.
ETL · Cassandra · FbProphet · Data Quality · Data Analysis · Data Engineering
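To give a feel for the visit-count forecasting task above: the profile names FbProphet, but since Prophet is a heavyweight dependency, this hypothetical sketch substitutes a naive weekly-seasonality baseline (mean of past values in the same weekday slot). It is a stand-in for illustration, not the Prophet model itself.

```python
# Hypothetical stand-in for the Prophet-based patient-visit forecaster
# described above. Instead of Prophet's trend + seasonality model, this
# uses a seasonal-naive baseline: forecast each future day as the mean
# of all past observations that fall in the same weekday slot.

def seasonal_naive_forecast(daily_counts, horizon, season=7):
    """Forecast `horizon` future values from a list of daily counts.

    Day i of the history occupies seasonal slot i % season; a future day
    is predicted as the mean of every historical value in its slot.
    """
    if len(daily_counts) < season:
        raise ValueError("need at least one full season of history")
    forecast = []
    for step in range(horizon):
        slot = (len(daily_counts) + step) % season
        same_slot = daily_counts[slot::season]  # all history in this weekday slot
        forecast.append(sum(same_slot) / len(same_slot))
    return forecast

if __name__ == "__main__":
    # Four weeks of made-up daily visit counts with a clear weekly pattern.
    history = [10, 20, 30, 40, 50, 60, 70] * 4
    print(seasonal_naive_forecast(history, horizon=3))
    # [10.0, 20.0, 30.0]
```

A baseline like this is also what a Prophet model would typically be benchmarked against before being adopted for resource planning.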

Education

Army Institute of Technology, College of Engineering, Pune

Bachelor of Engineering - BE — Computer Science

Jan 2016 – Jan 2020

Kendriya Vidyalaya
