Ritvik Raj

Product Engineer

Pune, Maharashtra, India · 10 yrs 11 mos experience

Key Highlights

  • 75% reduction in data processing time through optimizations.
  • Architected scalable data solutions for major telecom and supply chain companies.
  • Expert in modern data engineering and cloud-native platforms.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on Big Data and Cloud-native solutions.

Skills

Core Skills

Data Engineering, Cloud-native Platforms, Streaming Infrastructure, Data Observability, Big Data, Computer Vision, Deep Learning, Machine Learning

Other Skills

APM, AWS, Algorithms, Alpha Generation, Amazon Web Services (AWS), Analytics, Android Development, Ansible, Apache Airflow, Apache Kafka, Apache NiFi, Apache Spark, Apache Sqoop, Apache Zeppelin, Applied Mathematics

About

Lead Data Engineer @ Blue Ridge | Ex-Jio | Mathematics and Computing (MnC) @ IIT Roorkee | Tohoku University | UiT | WorldQuant

Building Next-Gen Data Platforms

Transforming big data chaos into business-critical insights through innovative engineering. Currently leading the Data Platform Initiative at Blue Ridge, where I modernise legacy systems and architect scalable data solutions.

Across my career, I’ve delivered measurable impact: 75% reduction in data processing time via streaming optimisations, 90% pipeline performance boost by migrating from batch to real-time, 40% improvement in Jio’s cellular network latency affecting millions across India through data-driven network optimisation, and 80% cost savings through vendor evaluation, project sign-offs, and in-house builds.

As a Software/Data Engineer with deep technical expertise and several years of experience, I specialise in modern data engineering with expertise across Hadoop, SQL, cloud-native platforms (AWS, Databricks), and next-gen frameworks like Polars and Rust. My work spans real-time streaming architectures to cloud-native data platforms, gained through experience at telecom giants (Jio), supply chain leaders (Blue Ridge), and data observability platforms (Unravel Data).

  • Big Data & Streaming: Apache Spark, Kafka, Hadoop, PySpark, Scala, Python, Rust, Polars
  • Cloud & Orchestration: AWS, Azure, Databricks, Airflow, Jenkins
  • SQL & Analytics: Deep experience with SQL-based technologies and analytics solutions
  • Performance Optimisation: Built petabyte-scale systems at Jio; expert in Spark optimisation, Kafka multithreading, Polars HPC, and benchmarking frameworks
  • GenAI & Innovation: Embedding AI/ML into pipelines and exploring LLMs for automation

Led cross-functional teams delivering critical data platforms under tight timelines. Mentored engineers, creating best practices that became organizational standards. Designed data solutions aligned with business strategies, driving growth.
I connect with engineers, leaders, and innovators pushing the limits of data. Whether you’re scaling platforms, integrating GenAI, or building next-gen products, let’s share insights. I write on data engineering, GenAI, and emerging tech. My newsletter “Code & Compute” simplifies complex data architecture and shares real-world lessons. I believe in making data engineering knowledge accessible, helping the community stay current. Always learning and evolving with new trends in data engineering and GenAI. DM me to discuss architecture, GenAI, or collaborations. 📧 ritvik.iitr@gmail.com | topmate.io/ritvik_raj

Experience

Total Experience: 10 yrs 11 mos
Average Tenure: 1 yr 6 mos
Current Experience: 1 yr 9 mos

Blue Ridge

Lead Data Engineer

Sep 2024 - Present · 1 yr 9 mos · Pune, Maharashtra, India · Remote

  • Team: Data Platform Initiative (DPI)
  • About Company: Blue Ridge, a leading AI-powered, cloud-native supply chain management platform serving 220+ global distributors and retailers to optimize forecasting, inventory, replenishment, and integrated business planning (S&OP) while driving operational efficiency.
  • Leading the new Data Platform Initiative (DPI) to modernise legacy systems built in C# and Stored Procedures, leveraging scalable, distributed systems and modern technologies like Spark, Databricks, and cloud-native pipelines. Took ownership from external vendors and delivered end-to-end reliable solutions.
  • Designed and implemented scalable PySpark and Polars pipelines orchestrated with Airflow DAGs, covering end-to-end workflow logic, automation, and analytics for critical supply chain and operational use cases with high availability.
  • Architected the Bronze, Silver, and Gold layers for structured file ingestion and processing, developing generic Databricks jobs to automate ingestion from AWS S3 and building fault-tolerant, scalable pipelines for incoming custom files to support downstream workflows.
  • Led performance benchmarking of Polars, PySpark, and Pandas on millions of records, introduced cost-optimized, high-performance components under tight timelines, and received a monthly award for delivering a critical Polars-based proof-of-concept and use case.
Apache Spark, Databricks, Supply Chain Management, Polars, Amazon Web Services (AWS), Rust (Programming Language), +9

Quarks

Senior Data Engineer

Jul 2023 - Apr 2024 · 9 mos · Gurugram, Haryana, India · Remote

  • Architected and optimized high-throughput streaming infrastructure serving millions of users, leveraging Spark, Scala, Kafka, and NiFi to reduce data processing latency by 75% and achieve 4x pipeline acceleration through advanced multithreading optimization at RDD, Kafka, and API levels.
  • Led critical system modernization initiative migrating legacy batch processes from Drools to real-time Spark Streaming, delivering 90%+ performance improvement and reducing processing time from 20 minutes to near real-time execution, enabling 3-4x revenue growth for 8+ enterprise clients through enhanced real-time decision-making.
  • Designed scalable, reusable Rule Engine framework with trigger logic for dynamic rule activation/deactivation, reducing development and deployment time by 100% for new rules, eliminating manual development overhead and cutting deployment cycles to integration testing only.
  • Built production-grade streaming pipelines for mission-critical modules including Reporting, Rule Engine, Notification, Notification Retry, and API Calling with comprehensive testing coverage (unit/integration), ensuring enterprise-grade reliability and seamless deployment.
  • Delivered customer-centric solutions through direct stakeholder engagement, translating complex business requirements into technical implementations while maintaining high customer satisfaction through proactive bug resolution.
  • Drove security and infrastructure excellence by upgrading Log4j across 10+ microservices (Spark, Scala, Kafka, MySQL stack), eliminating critical vulnerabilities while maintaining zero downtime and reducing system downtime by 70% through optimized multi-server deployments.
  • Enhanced data platform capabilities through end-to-end pipeline management, Kafka cluster optimization, and comprehensive NiFi automation (setup, version control, CLI), improving network data ingestion efficiency by 20% and overall data flow by 75% while conducting advanced R&D on Kafka rebalancing solutions.
Big Data, Apache Kafka, Apache Spark, Data Engineering, MySQL, REST APIs, +7

Unravel Data

Senior Software Engineer

Oct 2021 - Mar 2023 · 1 yr 5 mos · Bengaluru, Karnataka, India · Hybrid

  • Team: Innovation and Insights Team (CTO’s Office)
  • About Company: Data observability/actionability platform providing complete visibility and AI-powered optimization for data pipelines across multi-cloud environments. Trusted by Fortune 500 companies including Adobe, Mastercard, and others.
  • As part of this team, responsible for developing innovative prototypes to delight customers and fuel growth.
  • Implemented the Spark insights feature 'gap analysis' within Unity-One and Unravel apps, identifying runtime delays and delivering actionable optimization steps that enhanced Spark job performance analysis. This feature continues to be used by many customers to this day, indirectly contributing to revenue growth through improved customer satisfaction and platform value.
  • Designed and developed an ETL pipeline to collect cluster event data from Databricks using the Clusters API, ingesting into Elasticsearch to generate real-time insights that reduced critical events by over 75%.
  • Refactored and enhanced the Comparator App to support multiple Unravel app versions by accommodating data source and schema changes, increasing customer accessibility by 30%.
  • Automated application deployment across instances using Jenkins pipelines, achieving a 90% reduction in deployment time.
Apache Kafka, Azure Databricks, Apache Spark, Python, Plotly, DASH, +9

Jio

Data Engineer

Jul 2019 - Oct 2021 · 2 yrs 3 mos · Navi Mumbai, Maharashtra, India · On-site

  • Team: Analytics Center of Excellence (CoE)
  • About Company: Jio, India’s leading telecommunications and digital services provider, serving over 500 million subscribers and processing multiple petabytes of data daily through vast, scalable data infrastructure and analytics platforms.
  • Worked as a Data Engineer (official title: Manager) in the Analytics CoE, driving big data platform development on JBDL (Jio Big Data Lake) and contributing to an improvement of up to 30% in data processing efficiency.
  • Planned, designed, and managed scalable big data platforms processing petabytes of data from multiple sources by implementing optimized end-to-end data pipelines.
  • Built robust in-house big data capabilities, replacing external vendors and saving 80% of resources.
  • Contributed to the development and use of in-house ingestion, business, and platform frameworks, building multiple data use cases that leveraged these tools for optimized processing and analytics.
  • Developed KPIs using HDFS and Hive, visualized on dashboards to deliver actionable insights that improved network latency of Jio’s cellular and fiber networks by 40%.
  • Deployed and scaled data solutions involving ingestion, preprocessing, aggregation, and visualization across batch, mini-batch, and streaming using Spark, NiFi, Kafka, Scala, Hive, and visualization tools.
  • Scaled infrastructure to handle increased data volumes and pipeline complexity while ensuring high reliability and performance, and applied engineering and data analytics expertise to drive business strategy and data-driven decision-making.
  • Contributed to the upgrade from HDP 2.0 to HDP 3.0, followed by a successful proof of concept (POC) and migration from HDP 3.0 to Cloudera Data Platform (CDP), and supported a hybrid cloud strategy by migrating select workloads from on-premises HDP/CDP clusters to Azure, evaluating performance, scalability, integration, and cloud interoperability to modernize the data platform.
Apache Spark, Hadoop, Microsoft Azure, Data Architecture, Apache Kafka, Apache NiFi, +24

Tohoku University

Research Assistant at Computer Vision Lab

Dec 2018 - Apr 2019 · 4 mos · Sendai, Miyagi, Japan · On-site

  • Completed Master’s Thesis in Computer Vision (Grade: 9/10) on “Detection of components of the building structure (bridges) from their images using deep learning”
  • Researched and implemented convolutional neural networks (CNNs) for edge detection and object detection of bridge components, with the objective of identifying areas prone to cracks.
  • Encountered two main challenges: generation of training data and improvement of detection accuracy. To address the first, generated synthetic images from a 3D bridge model, applied data augmentation, and validated results using real bridge images. For the second challenge, examined the differences between synthetic and real data, identified variations in noise and illumination, and developed methods to minimize their impact, thereby making both domains more consistent for better detection performance.
  • Discovered that incorporating edge maps into detection tasks produced more accurate results than relying solely on original images.
  • Concluded that leveraging synthetic training data greatly reduces time and effort compared to collecting large volumes of real-world data, while still ensuring reliable performance.
Python (Programming Language), Deep Learning, TensorFlow, PyTorch, Machine Learning, Computer Vision, +2

UiT The Arctic University of Norway

Summer Research Intern

May 2018 - Aug 2018 · 3 mos · Tromsø, Troms og Finnmark, Norway · On-site

  • Computer Vision Intern in the Department of Engineering Science and Safety (Automation)
  • Project: "Investigating the role of asymmetry and symmetry operators in visual attention"
  • Applied fundamental mathematical principles to understand underlying mechanisms of visual attention.

Himalayan Explorers' Club, IIT Roorkee

2 roles

Secretary

Jul 2017 - May 2018 · 10 mos · Roorkee, Uttarakhand, India

  • HEC is IIT Roorkee's most famous cultural club. It is responsible for organizing various events such as treks, rafting, paragliding, CAT course, cycle race, adventure activities, etc.
  • Around 1500 students participate in all these events held throughout the year. HEC has a committee of approximately 30 students, which manages the club's activities.
  • Secretary of Fitness Conditioning and Training.

Joint Secretary

Jul 2016 - Apr 2017 · 9 mos · Roorkee, Uttarakhand, India

  • Joint Secretary of Database Management and Design.

WorldQuant LLC

Data Analysis and Quantitative Research Intern

Aug 2016 - Sep 2017 · 1 yr 1 mo · Mumbai

  • Achieved Gold level during Quantitative Research Summer Program.
  • Ranked within the all-India top 40 on the leaderboard.
  • Built high Sharpe ratio alphas to determine the risk-return profile of equities.
  • Analysed the financial market, worked with basic stock pricing and volume data, performed simulations, interpreted results, and converted ideas to algorithms.
  • Implemented trading strategies in Python using SciPy, NumPy, scikit-learn, etc. to place bets on instruments expected to be profitable in the long run.

Indian Institute of Technology, Roorkee

5 year Integrated Masters in Applied Mathematics, IIT Roorkee

Jul 2014 - Apr 2019 · 4 yrs 9 mos · Roorkee, India

  • Master's Thesis Grade: 9.0 (Computer Vision)
  • Some of the courses taken:
  • C++, Data Structures, Design and Analysis of Algorithms, DBMS, Graph Theory, Discrete Mathematics, Linear Algebra, Linear and Non-Linear Programming (Operations Research), Mathematical Statistics, Computer Vision, Digital Image Processing, Statistical Inference, Soft Computing, Numerical Methods, Number Theory, Theory of Computation, Evolutionary Algorithms.
Machine Learning, Matlab, Python (Programming Language), Mathematics, Applied Mathematics, Image Processing, +2

Education

Indian Institute of Technology, Roorkee

5 year Integrated Masters — Applied Mathematics

Jan 2014 - Jan 2019

St. Michael's High School, Patna

Higher Secondary Education — PCM

Jan 2011 - Jan 2013

International School, Patna

Secondary Education

Jan 2000 - Jan 2011
