Naresh Kumar

CEO

Noida, Uttar Pradesh, India
12 yrs 9 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Led the design of a Unified Customer Data Platform.
  • Achieved significant cost savings through data platform optimizations.
  • Open-sourced a comprehensive data quality framework.
Stackforce AI infers this person is a Data Engineering expert in E-commerce and Open Source domains.

Skills

Core Skills

Data Architecture · Data Engineering · Data Quality

Other Skills

API Development · Amazon Web Services (AWS) · Analytical Skills · Analytics · Apache Airflow · Apache Kafka · Apache Spark · Architecture · Batch Processing · Big Data · BigTable · Business Intelligence (BI) · Cassandra · Client Management · Communication

About

I am a Principal Architect at Tokopedia, the leading e-commerce platform in Indonesia and a subsidiary of GoTo Group, the largest technology group in the region. With over 11 years of experience in data engineering and architecture, I have a strong background in designing and implementing data platform solutions that enable hyper-personalization, data discovery, and data quality across multiple business units and domains.

In my current role, I am envisioning and overseeing the Unified Customer Data Platform for four subsidiaries of GoTo Group, which will remove the need for siloed technologies and enhance the customer experience across the e-commerce journey. I am also guiding and leading the implementation of the Domain Data Platforms (Data Mesh) framework, which will provide decentralized, self-service data access and governance for each business unit. Additionally, I am responsible for various cost-saving and optimization initiatives for the data platform, which have resulted in savings of over 200k/month.

My passion for data drives me to explore new technologies, contribute to the open-source community, and share my knowledge with others. I have built and open-sourced a data quality framework using Spark and Pandas in Python, which offers custom validation checks, data profiling, and anomaly detection. I also hold a Certificate in Advanced Business Analytics from the Indian School of Business, where I learned how to apply data science and machine learning techniques to solve business problems. I am always eager to learn, collaborate, and innovate in the fast-paced and evolving field of data.

Experience

Total Experience: 12 yrs 9 mos
Average Tenure: 3 yrs 2 mos
Current Experience: 6 yrs 7 mos

Tokopedia

3 roles

Principal Architect - Data Platform

Promoted

Jan 2023 - Present · 3 yrs 5 mos · Noida, Uttar Pradesh, India · On-site

  • Currently envisioning and designing the Unified Customer Data Platform for four subsidiaries of GoTo Group. This will remove the need for siloed technologies for hyper-personalization of customer behavior across the e-commerce journey
  • Envisioned and architected the high-level structure of Domain Data Platforms (Data Mesh) for business units. Currently overseeing the implementation with a team of 28 data engineers
  • Guiding and overseeing the implementation of the Unified Feature Store, taking into account data governance, security, PII policies, etc.
Data Engineering · Data Architecture · Data Governance · Customer Data Platform

Technical Architect - Data Platform

Promoted

Jan 2021 - Dec 2022 · 1 yr 11 mos · Noida, Uttar Pradesh, India · On-site

  • Envisioned and architected a high-level Unified Feature Store for the entire organization, replacing the need for siloed data and modeling pipelines, with potential savings of 25k/month
  • Led various cost-saving initiatives for the data platform that saved 165k/month, including data lifecycle management, technical optimizations of cloud usage, and architectural improvements to various services
  • Designed the high-level and low-level workflows of the in-house data quality and data discovery solutions, namely Metis and Data Catalog respectively
  • Implemented the Metis (Data Quality) and Data Catalog (Data Discovery) frameworks in a team of 20
  • Open-sourced the data quality framework dq-whistler, with features such as custom validation checks, data profiling, and a Kaggle-like visual representation of basic data metrics
  • Collaborated with other business units to onboard their use cases onto the Data Platform. This helped reduce siloed infrastructure and enabled better platform standardization
  • Designed and implemented the Customer360 and Merchant360 APIs with a team of 5. The API now serves 15k RPS and 6 business units internally, acting as a single source of all user and merchant (seller) features
  • Wrote various guideline documents on Confluence covering GitHub, database sharing, API naming conventions, monitoring and alerting, application logging, and many other technical practices
Data Quality · Data Discovery · API Development · Cost Optimization · Data Engineering

Lead Data Engineer

Sep 2019 - Dec 2020 · 1 yr 3 mos · Noida, Uttar Pradesh, India · On-site

  • Designed and implemented personalization services such as Segmentation and User Journey platforms in a team of 6, serving 500 million notifications every month to 30 million MAU. This replaced MoEngage with an in-house framework, resulting in cost savings of $400k annually
  • Designed and implemented a streaming framework for capturing app (Android/iOS) and web platform events and writing them into the DWH. The framework includes an API that captures and publishes events to the messaging queue, and a streaming pipeline that reads those events and writes them to the DWH. The API serves 45-50k RPS during flash sales, and the streaming pipeline processes ~3.5TB of data and ~3B events daily
  • Implemented Nexus, a batch framework for data pipelines. It supports fully customizable pipelines with custom transformations. Sources and destinations can be chosen from BigQuery, Bigtable, raw files, relational databases, and Cloud Storage (S3, Google Bucket). Any source can be paired with any destination
Personalization Services · Streaming Framework · Data Pipelines · Data Engineering

Airtel digital

Lead Data Engineer

May 2018 - Sep 2019 · 1 yr 4 mos · Gurugram, Haryana, India · On-site

  • Implemented a complaint management event processing system (Apache Spark Streaming) that captures customer complaints as events and forwards them to the respective departments based on a decision tree. This reduced the overall turnaround time from 7 days to 2 days
  • Implemented batch Spark jobs to process 200TB of daily CDR (Call Data Records) data
  • Implemented an API (Django) to visualize network tower location data, giving a better understanding of regions with weak signal strength that are candidates for improvement
  • Coordinated a team of 12 data engineers
Event Processing · Batch Processing · API Development · Data Engineering

Innovaccer

2 roles

Senior Data Engineer/Engineering Manager

Promoted

Jan 2017 - May 2018 · 1 yr 4 mos · Noida, Uttar Pradesh, India · On-site

  • Coordinated a team of 8 software engineers leading the Customer Success division of the organization, handling 4 clients in parallel and ensuring technical data support for those clients
  • Managed onsite (USA) data operations for 2 clients. Spent 8 months at different client locations in the USA to understand the nature of their patient-centric business, the data manageability, discovery, quality, and observability practices they had adopted, and potential upsell opportunities for in-house data ETL products
Client Management · Data Operations · Data Engineering

Data Engineer

Apr 2015 - Dec 2016 · 1 yr 8 mos · Noida, Uttar Pradesh, India · On-site

  • Developed a dashboard using Python to showcase trading data for one of our clients
  • Developed an initial prototype of a data quality framework in Spark (Scala)
  • Worked closely with a client to set up their search engine infrastructure using the ELK stack
Dashboard Development · Data Quality Framework · Data Engineering

Mu sigma inc.

Data Engineer

Jun 2013 - Apr 2015 · 1 yr 10 mos · Bengaluru, Karnataka, India · On-site

  • Developed the Rstash package for the R language. This package replaced the Rgithub package in the Rcloud platform (an AT&T platform built on top of the R language)
  • Developed the search engine for the Rcloud platform using Apache Solr
  • Contributed to rediscc, an R package, and actively collaborated with Simon Urbanek, one of the maintainers of the R language for macOS
  • Organized hands-on sessions on the R language for other teams
R Programming · Search Engine Development · Data Engineering

Education

Indian School of Business

Certificate in Advanced Business Analytics — Business Analytics

Feb 2016 - Feb 2017

Army Institute of Technology (AIT), Pune

Bachelor of Engineering - BE — Electrical and Telecommunications Engineering

Jun 2009 - Jun 2013

Kendriya Vidyalaya

Senior Secondary Certificate — Science

Apr 2007 - Apr 2008

Kendriya Vidyalaya

Secondary Certificate

Apr 2005 - Apr 2006
