Nikhil Chole

Data Engineer

Mumbai, Maharashtra, India · 5 yrs 7 mos experience

Key Highlights

  • Designed CI/CD pipelines reducing deployment time significantly.
  • Developed a risk profiling framework recognized with awards.
  • Engineered real-time fraud detection enhancing security.

Skills

Core Skills

Data Pipeline Architecture · Distributed Data Processing · Data Warehouse Architecture · Data Engineering · Data Integration

Other Skills

PySpark · Big Data · Jenkins · Apache Iceberg · Control-M · Hive · Data Modeling · Apache Spark · Performance Optimization · Hadoop · Python (Programming Language) · Kafka · Spark Streaming · ETL · Data Warehousing

About

An accidental Data Engineer who fell in love with the world of data systems :D

Over the last 5+ years, I've built reliable data pipelines and platforms to fuel analytics and ML systems.

Core Strengths:

  • Data pipeline architecture experience with on-premises and cloud platforms (AWS).
  • Deep understanding of distributed data processing.
  • Proven experience in data modeling and warehouse design.
  • Data structures, algorithms, and performance optimization for large-scale datasets.

Tech Stack: Python • Apache Spark • Kafka • SQL • Hive • Iceberg • S3 • CI/CD • Git • Informatica PowerCenter • Control-M • Oracle Exadata • MPP Databases • Neo4j • Cypher

I am particularly interested in distributed systems, data platform architecture, and building reliable data infrastructure at scale.

Experience

Axis Bank

2 roles

Senior Data Engineer

Promoted

May 2024 – Present · 1 yr 10 mos · Mumbai · Remote

  • Designed and implemented a Jenkins CI/CD pipeline with a reusable shell script to standardize PySpark deployments across all Spark jobs. This removed the need for job-specific scripts and centralized JAR dependency management, reducing deployment time and saving several hours of manual effort in each release cycle.
  • Migrated 100+ legacy Hive tables to Apache Iceberg and built an automated maintenance pipeline. This modernization of the lakehouse architecture significantly reduced HDFS storage usage and removed the need for manual table maintenance.
  • Built a Credit Card Customer Feature Store that consolidated customer-level attributes into a single source of truth. This enabled reuse across multiple analytics, campaign, and ML pipelines, reducing duplicate development work for Data Science and Analytics teams.
  • Implemented a centralized source control framework integrated with Control-M file watcher jobs. Spark pipelines now trigger automatically only when all upstream data sources are ready, eliminating manual dependency monitoring.
  • Optimized existing Spark jobs by tuning configurations and restructuring processing logic, achieving an 80% reduction in processing time and lowering compute costs across the data platform.
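The dependency-gated triggering described above (Spark jobs launching only once every upstream source is ready) can be sketched in plain Python. This is a minimal illustration using hypothetical `_SUCCESS` marker files; the actual implementation used Control-M file watcher jobs rather than this code.

```python
import os
import tempfile

def maybe_trigger(marker_paths, launch):
    """Invoke `launch` (e.g. a spark-submit wrapper) only when every
    upstream source has written its readiness marker; otherwise do
    nothing and report what is still pending. Marker-file naming is
    illustrative, not the production convention."""
    missing = [p for p in marker_paths if not os.path.exists(p)]
    if missing:
        return f"waiting on {len(missing)} upstream source(s)"
    launch()
    return "triggered"

# Usage: two upstream feeds, only one ready, so the job must wait.
with tempfile.TemporaryDirectory() as d:
    feeds = [os.path.join(d, "cards", "_SUCCESS"),
             os.path.join(d, "loans", "_SUCCESS")]
    os.makedirs(os.path.dirname(feeds[0]))
    open(feeds[0], "w").close()  # only the first feed has landed
    print(maybe_trigger(feeds, launch=lambda: None))  # waiting on 1 upstream source(s)
```

The benefit over polling by hand is that the gate is declarative: each pipeline lists its upstream markers once, and no one has to watch for late-arriving feeds.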
PySpark · Big Data · Jenkins · Apache Iceberg · Control-M · Data Pipeline Architecture +1

Data Engineer

Aug 2020 – May 2024 · 3 yrs 9 mos · Mumbai · Remote

  • Developed a unified customer risk profiling framework integrating data from 10+ banking products for ~30M customers. This enabled centralized risk visibility across banking channels and was recognized with the Economic Times CIO Award 2024 for Excellence in Technology Implementation – Business Resilience Impact and an internal BIU Star Award.
  • Engineered real-time fraud detection pipelines using Kafka and Spark Streaming to process high-velocity transaction data. This enabled instant identification of suspicious transactions and strengthened fraud prevention across digital banking channels.
  • Implemented advanced data validation and sanitization processes within data pipelines. This reduced data anomalies by 30% and improved data quality, which increased fraud detection effectiveness by 20%.
  • Built a centralized fraud data mart and optimized ETL pipelines. This reduced processing time by ~80% and enabled faster fraud investigations and reporting for analytics and risk teams.
  • Collaborated closely with risk, fraud analytics, and compliance teams to design scalable data solutions. These systems helped prevent ₹12+ crore in financial losses and improved customer trust in digital banking platforms.
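The validation-and-sanitization step described above can be sketched as a small pure-Python function. Field names and rules here are hypothetical stand-ins; the production checks ran inside Spark pipelines rather than this code.

```python
from datetime import datetime

# Mandatory fields are an assumption for illustration.
REQUIRED = ("txn_id", "amount", "timestamp")

def sanitize(records):
    """Split raw transaction records into clean rows and anomalies:
    missing mandatory fields, unparsable values, or non-positive
    amounts are quarantined instead of flowing downstream."""
    clean, anomalies = [], []
    for rec in records:
        if any(rec.get(f) in (None, "") for f in REQUIRED):
            anomalies.append(rec)   # missing mandatory field
            continue
        try:
            amount = float(rec["amount"])
            datetime.fromisoformat(rec["timestamp"])
        except (TypeError, ValueError):
            anomalies.append(rec)   # unparsable amount or timestamp
            continue
        if amount <= 0:
            anomalies.append(rec)   # non-positive amount
            continue
        clean.append({**rec, "amount": amount})
    return clean, anomalies

rows = [
    {"txn_id": "t1", "amount": "120.50", "timestamp": "2024-01-05T10:00:00"},
    {"txn_id": "t2", "amount": "-5", "timestamp": "2024-01-05T10:01:00"},
    {"txn_id": "t3", "amount": "oops", "timestamp": "2024-01-05T10:02:00"},
]
clean, bad = sanitize(rows)
print(len(clean), len(bad))  # 1 2
```

Quarantining anomalies (rather than silently dropping them) is what makes the anomaly-rate reduction measurable and keeps bad rows out of downstream fraud scoring.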
Hadoop · Python (Programming Language) · Kafka · Spark Streaming · Data Engineering · Data Integration

Education

Indian Institute of Technology, Kanpur

Bachelor of Technology — Chemical Engineering

Jan 2015 – Jan 2019

Indian Institute of Technology, Kanpur

Bachelor's Degree — Economics

Jan 2019 – Jan 2020
