Aviral Bhardwaj

Data Engineer

Bengaluru, Karnataka, India · 9 yrs 3 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • 7 years of experience in Big Data Engineering.
  • Certified Databricks Data Engineer Professional.
  • Led successful data migrations for global clients.
Stackforce AI infers this person is a Data Engineering specialist in the SaaS and Cloud Computing sectors.

Skills

Core Skills

Databricks · Data Engineering · AWS

Other Skills

Data Build Tool (DBT) · PySpark · Python (Programming Language) · Amazon Web Services (AWS) · SQL · Apache Airflow · Azure Databricks · Terraform · Data Architects · Microsoft Azure · Azure Data Factory · Data Loading · Amazon S3 · GitHub · GitLab

About

Experienced Data Engineer and open-source contributor with 7 years devoted to Big Data Engineering, specializing in Databricks, AWS, PySpark, SQL, ETL, and modern cloud architectures. As a Data Engineer, I design robust SaaS solutions that optimize data pipelines and orchestration across the insurance, retail, and healthcare sectors.

I am certified as a Databricks Data Engineer Professional, Spark Associate, ML Associate, Data Analyst, and Data Engineer Associate, reflecting my expertise and commitment to continuous growth. I have hands-on experience with Databricks, Snowflake, Unity Catalog, Delta Lake, MLflow, Postman, Fivetran, SAP HANA, ADF, and Azure. Adept in Python, Git, version control, and Airflow orchestration on AWS EC2, I’ve built resilient systems for both batch and streaming use cases, tackling historical and incremental data workflows.

At Coforge Limited and GMG International, I led teams and delivered high-stakes projects, such as migrating data from AWS RDS to Redshift and SQL Server, integrating the Fluent Commerce API for a retail OMS, and executing seamless data migrations from SAP HANA to data lakes. My leadership includes building and mentoring high-performance teams, pioneering divisions, and collaborating directly with business users to create gold-standard data views in Unity Catalog.

Previous engagements with global clients like Coca-Cola (Azure ADF pipelines, Databricks-PySpark transformation) and Amgen Inc. (cost-optimized AWS-Databricks solutions, open-source Airflow orchestration, and Databricks E2 migrations) highlight my end-to-end command of data platforms and ability to drive operational efficiency and cost savings.

As a Udemy instructor with 25,000+ students, I transform technical learning into accessible knowledge. My open-source projects include contributions to AWS, Microsoft’s MarkItDown, and Unity Catalog for the Linux Foundation, all reinforcing my passion for collaboration and technological advancement.

With postgraduate credentials in Big Data Analytics (CDAC Pune) and a background in Electronics & Telecommunication Engineering, I fuse domain knowledge with a pragmatic approach to deliver real business value through every project, platform, and partnership. If you seek expertise in Databricks, cloud data engineering, or SaaS, or want to modernize your data infrastructure, let’s connect!

Experience

Total Experience: 9 yrs 3 mos
Average Tenure: 1 yr 9 mos
Current Experience: 7 mos

Confidential

Databricks Data Engineer

Oct 2025 - Present · 7 mos · India · Remote

  • Developing an open-source project (https://deltalakestudio.com/) that will be a no-code/low-code platform.
Databricks · Data Build Tool (DBT)

Databricks

3 roles

Databricks Preferred Partner DPP

Oct 2024 - Present · 1 yr 7 mos · Remote

  • As a Databricks Preferred Partner, we guide clients on their migration strategy and end-to-end Databricks deployment.

Senior Data Engineer

Sep 2024 - Feb 2025 · 5 mos · Remote

Python (Programming Language) · PySpark · Databricks · Amazon Web Services (AWS)

Databricks Community Contributor

Jan 2020 - Oct 2025 · 5 yrs 9 mos · Remote

  • As a Databricks Community Contributor, I actively assist data professionals in resolving complex challenges related to Databricks. I regularly engage with community members, share solutions, and contribute to discussions to support the broader data engineering community.
  • You can view my profile here:
  • https://community.databricks.com/t5/user/viewprofilepage/user-id/53460
  • I am also recognized as a featured member in the Databricks Community:
  • https://community.databricks.com/s/feed/0D58Y00009oyBOmSAM
PySpark · Python (Programming Language) · Databricks

GitHub

Open Source Contributor

Aug 2024 - Present · 1 yr 9 mos · Bengaluru, Karnataka, India · Remote

  • I am actively contributing to the open source community, focusing on projects that drive innovation in data engineering and cloud technologies. My GitHub profile showcases my ongoing work and collaborations: github.com/aviral-bhardwaj.
  • Some of the notable projects I contribute to include:
  • MarkItDown by Microsoft: A popular Python package that converts a wide range of file formats, including Office documents, images, audio, and web data, into Markdown, making it a powerful tool for content conversion and LLM training.
  • Unity Catalog by Databricks: The industry’s first open source catalog for unified data and AI governance, enabling organizations to manage and secure data assets across clouds and data platforms. Unity Catalog is now open-sourced, fostering collaboration and integration within the community.
  • Open Source Projects by AWS: I engage with AWS-driven open source initiatives, which include a variety of developer tools and cloud-native projects such as the AWS Cloud Development Kit, Bottlerocket, Firecracker, and OpenSearch, all aimed at improving cloud development and operational excellence.
  • I am passionate about building and collaborating on cloud-based data solutions, and I am always open to new opportunities to contribute and learn within the open source ecosystem.
PySpark · Python (Programming Language) · SQL · Apache Airflow · Databricks

Coforge

Lead Data Engineer

Jul 2024 - Oct 2025 · 1 yr 3 mos · Hyderabad, Telangana, India · On-site

  • I played a key role in implementing advanced data engineering solutions at Coforge Limited, focusing on Databricks and AWS technologies.
  • Developed data pipelines that provided cost usage notifications, enhancing stakeholder awareness.
  • Collaborated with the Fivetran team to integrate audit logs into S3 storage for improved data governance.
  • Successfully migrated Databricks jobs across various environments, ensuring smooth deployment and operational continuity.
Databricks · PySpark · Python (Programming Language) · SQL

TriNet

Lead Data Engineer

Jul 2024 - Oct 2025 · 1 yr 3 mos · Hyderabad, Telangana, India · Remote

  • Implemented Databricks products including Unity Catalog, Service Principals, and Cluster Policies.
  • Developed data pipelines to provide cost usage notifications to key stakeholders.
  • Collaborated with Fivetran team to integrate audit logs into S3 storage.
  • Successfully migrated Databricks jobs across development, test, and production workspaces, ensuring seamless deployment.
  • Collaborated with a major HR client (TriNet) to implement Databricks solutions.

Drivewealth

Databricks Data Engineer

Jan 2024 - Jul 2024 · 6 mos · Gurugram, Haryana, India · Remote

  • Collaborated with the Drivewealth team to migrate the Hive Metastore data to Unity Catalog.
  • Analyzed the existing architecture and recommended a new approach for deploying solutions using Terraform.
  • Migrated Databricks jobs across development, testing, and production workspaces for smoother deployment.
Azure Databricks · Amazon Web Services (AWS) · Terraform · Databricks

GMG

Data Engineer

Jan 2024 - Jul 2024 · 6 mos · Gurugram, Haryana, India · Remote

  • Successfully completed the GMG Retail Order Management System (OMS) project within deadlines, utilizing Fluent Commerce API.
  • Developed data pipelines in Unity Catalog across bronze, silver, and gold layers, ensuring data integrity and accessibility.
  • Pioneered the Data Engineering division at GMG's Technology Center, leading a team of five to enhance data infrastructure.
Data Architects · Python (Programming Language) · PySpark · Azure Databricks · Microsoft Azure · Azure Data Factory · +3

Packt

Book Reviewer

Jan 2024 - Feb 2024 · 1 mo · Gurugram, Haryana, India · Remote

  • Book review: I reviewed Databricks Certified Associate Developer for Apache Spark Using Python: The Ultimate Guide to Getting Certified in Apache Spark Using Practical Examples with Python. This comprehensive guide is designed to help readers prepare for the Databricks certification exam by providing practical examples and clear explanations using Python. The book covers essential Spark concepts, hands-on exercises, and exam-focused tips, making it a valuable resource for anyone aiming to become a certified Spark developer.
  • For more details, you can find the book at Packt Publishing:
  • https://www.packtpub.com/en-in/product/databricks-certified-associate-developer-for-apache-spark-using-python-9781804619780
Databricks · PySpark

FanDuel

Databricks Data Engineer

Nov 2023 - Jan 2024 · 2 mos · Bengaluru, Karnataka, India · Remote

  • Worked with the FanDuel Insight Engineering team to deploy Databricks products such as Unity Catalog, Service Principals, Cluster Policies, and Data Pipelines for cost usage notifications to stakeholders.
  • Collaborated with the Fivetran team to ingest Fivetran audit logs into S3.
  • Worked closely with the Tableau team on data ingestion and multiple token-related issues.
  • Extensively used Terraform to simplify and deploy Databricks products and data pipelines.
Azure Databricks · Databricks

Lovelytics

Databricks Data Engineer

Nov 2023 - Jan 2024 · 2 mos · Bengaluru, Karnataka, India · Remote

Amazon Web Services (AWS) · Amazon S3 · GitHub · GitLab · Amazon Elastic MapReduce (EMR) · Amazon Athena · +4

Knowledge Lens - A Rockwell Automation Company

2 roles

Senior Data Engineer

Apr 2021 - Jan 2024 · 2 yrs 9 mos · Remote

  • In my role at Knowledge Lens, I spearheaded the transition of on-premise databases to cloud solutions for major clients like Coca-Cola and Amgen Inc. I developed robust data pipelines using Azure Data Factory and AWS services, ensuring efficient data transformation and validation across various medical streams. My work significantly improved data processing capabilities while maintaining high security and cost-efficiency.
Databricks · Databricks Products

Data Engineer

Sep 2019 - Oct 2023 · 4 yrs 1 mo · Remote

Python (Programming Language) · Azure Databricks · Linux · SQL · PySpark · Databricks · +1

Amgen

Data Engineer

Mar 2020 - Jan 2023 · 2 yrs 10 mos · Bengaluru, Karnataka, India · Remote

  • Developed robust data pipelines for oncology, respiratory, and COVID-19 medical streams.
  • Designed ETL workflows utilizing AWS Glue and Databricks with PySpark and Python.
  • Implemented open-source Airflow on AWS EC2 for efficient orchestration and validation using SQL.
  • Automated AWS infrastructure management through Lambda functions, enhancing security policies.
Apache Spark · Management · Amazon Web Services (AWS) · Python (Programming Language) · Linux · Big Data · +3

Udemy

Udemy Instructor

Jan 2020 - Mar 2020 · 2 mos · Bengaluru South, Karnataka, India · Remote

  • I have recently started teaching on Udemy, dedicating my weekends to creating courses that help me enhance my skills while sharing knowledge with others. My current offerings include:
  • Python for Data Engineering: This course covers Python from the basics to advanced concepts, focusing on logic building, data structures, object-oriented programming, exception handling, and more. It is designed for engineering students and software professionals aiming to strengthen their Python skills for data engineering roles.
  • Databricks Associate Developer for Apache Spark: This course is tailored for those preparing for the Databricks certification, providing practical insights and hands-on examples with Apache Spark.
  • You can explore these courses here:
  • Python Course: https://www.udemy.com/course/python-for-data-engineering/
  • Spark Course: https://www.udemy.com/course/databricks-associate-developer-for-apache-spark/
  • I look forward to helping learners advance their careers in data engineering and big data technologies.
Data Architects · Python (Programming Language) · Amazon Web Services (AWS) · Azure Databricks · Microsoft Azure · Azure Data Factory · +1

AstraZeneca

Data Engineer

Sep 2019 - Mar 2020 · 6 mos · Bengaluru, Karnataka, India · Remote

  • Developed and maintained secure, cost-efficient data pipelines utilizing AWS services including EC2, S3, and EMR.
  • Conducted thorough data validation to ensure accuracy and reliability of data loads.
  • Leveraged Databricks functionalities to enhance data processing efficiency and performance.
Databricks · PySpark · SQL · Amazon Web Services (AWS)

FreeJobAware.com

Founder

Jul 2016 - Jan 2019 · 2 yrs 6 mos · India

  • Worked as a website developer, search engine optimization head, and content writer at FreeJobAware.
  • Managed a team of 8 people for 2 years.
  • Worked on on-page and off-page optimization.
  • Developed a website for the NGO Astha Seva Sansthan Kotdwar.
  • Developed a website for IBharatNews, a news portal for Uttarakhand.
Amazon Web Services (AWS) · Amazon S3 · GitHub · GitLab · Amazon Elastic MapReduce (EMR) · Amazon Athena · +4

Education

Centre for Development of Advanced Computing (C-DAC)

Post Graduate Diploma In Big Data Analytics — Big Data Analytics

Jan 2019 - Jan 2019

RustamJi Institute Of Technology

Bachelor of Engineering (BE) — Electronics & Communication

Jan 2012 - Jan 2016

Sarswati Vidhya Mandir Kotdwar

Science (Non Medical) — Maths

Jan 2010 - Jan 2012

Government Inter College Adhariyakhal

10th

Jan 2008 - Jan 2009
