Vedanth V Baliga

Data Engineer

Bengaluru, Karnataka, India · 1 yr 9 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Led development of a major data platform for crypto trading.
  • Expert in building scalable data pipelines with Databricks.
  • Strong collaboration with cross-functional business units.
Stackforce AI infers this person is a Fintech Data Engineer specializing in scalable data architectures and real-time analytics.

Skills

Core Skills

Data Engineering, Apache Spark, Machine Learning

Other Skills

Azure Databricks, Trading Systems, Data Modeling, Databricks, PySpark, Azure Data Lake Storage, Redpanda Kafka, PowerBI, Azure DevOps, SQL, Python, Deep Learning, Python (Programming Language), TensorFlow, Natural Language Processing (NLP)

About

I'm a Data Engineer at StoneX, a global financial services firm, with 2 years of experience designing, building, and maintaining end-to-end data systems that power analytics, reporting, and real-time decision-making. My core expertise lies in building scalable data pipelines and architectures using modern tools like Databricks, Apache Spark, and Redpanda.

I specialize in implementing Lakehouse architecture using Databricks, where I've worked extensively with Delta Lake for reliable storage and Delta Live Tables (DLT) to orchestrate production-grade data pipelines with built-in quality checks, monitoring, and versioning. In addition to Spark-based batch and streaming workloads, I work with Redpanda, a high-performance Kafka-compatible streaming platform, to power low-latency event-driven systems. This enables us to support real-time data ingestion, transformations, and alerts across financial datasets.

I use Python and SQL to write clean, modular ETL code, while Power BI is my go-to tool for turning raw data into insightful dashboards for business stakeholders.

🔧 Core Tools & Technologies:
  • 📌 Databricks: Lakehouse architecture, Delta Live Tables, Delta Lake, ML integration
  • 📌 Apache Spark (PySpark): Distributed data processing and transformation
  • 📌 Redpanda: High-throughput streaming pipelines
  • 📌 Python: Scripting, transformation logic, API interaction
  • 📌 SQL: Data modeling, querying, and optimization
  • 📌 Power BI: Business intelligence, reporting, interactive dashboards

I'm passionate about building data systems that are not just functional, but resilient, observable, and business-aligned. I value clean architecture, data reliability, and continuous learning.
Feel free to connect if you're working in data engineering, analytics, or building modern data platforms!

Experience

Total Experience: 1 yr 9 mos
Average Tenure: 1 yr 9 mos
Current Experience: 1 yr 9 mos

StoneX Group Inc.

3 roles

Data Engineer II

Jan 2026 – Present · 4 mos · Bengaluru, Karnataka, India · Hybrid

Data Engineer I

Jul 2024 – Dec 2025 · 1 yr 5 mos · Bengaluru, Karnataka, India · Hybrid

  • Project: Unified Data Platform for StoneX Digital LLC
  • 🟦 Tech Stack
  • 📌 Databricks, PySpark, Azure Data Lake Storage, Redpanda Kafka, and Power BI.
  • 🟦 Achievements
  • 📌 Tasked with leading the development and deployment of a major data engineering initiative for StoneX Digital LLC, a newly established institutional-grade digital assets platform. This platform now supports institutional crypto trading operations generating over $3 million in trading revenue daily, with hundreds of thousands of trades placed.
  • 📌 Designed and implemented the Lakehouse architecture using Databricks, structuring raw and curated data layers following the medallion architecture model. This included defining and modeling fact and dimension tables for multiple key financial domains such as Trades, Positions, Balances, and Postings, sourced from multiple external systems including Talos Trading and Fireblocks. I collaborated closely with data consumers and analysts to identify business-critical KPIs and data contracts, ensuring high data accuracy and discoverability across the Lakehouse.
  • 📌 Played a central role in building, orchestrating, and maintaining ETL pipelines, applying schema enforcement, change data capture, and built-in quality checks. This allowed the team to manage pipeline lineage, recovery, and performance with confidence while supporting continuous delivery of data with near real-time latency.
  • 📌 Collaborated cross-functionally with multiple business units including Risk, Dealing Operations, Compliance, and Finance, acting as the technical point of contact for the data engineering team. My work involved translating complex requirements into scalable, production-grade data solutions.
Apache Spark, Azure Databricks, Data Engineering, Trading Systems, Data Modeling
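
The medallion layering, schema enforcement, and quality checks mentioned in this role can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the StoneX implementation: the record fields (`trade_id`, `symbol`, `qty`, `price`), the `TRADE_SCHEMA` mapping, and all function names are hypothetical, a last-write-wins upsert stands in for change data capture, and in-memory lists stand in for the Delta tables a Databricks pipeline would use.

```python
from collections import defaultdict

# Hypothetical schema for a trade record: field name -> required type.
TRADE_SCHEMA = {"trade_id": str, "symbol": str, "qty": float, "price": float}


def enforce_schema(record, schema=TRADE_SCHEMA):
    """Return True only if the record has exactly the schema's fields and types."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[name], typ) for name, typ in schema.items())


def to_silver(bronze_rows):
    """Bronze -> silver: enforce the schema, apply quality checks, keep the latest row per key."""
    latest = {}
    rejected = []
    for row in bronze_rows:
        # Quality checks (assumed rules): quantity and price must be positive.
        if not enforce_schema(row) or row["qty"] <= 0 or row["price"] <= 0:
            rejected.append(row)
            continue
        latest[row["trade_id"]] = row  # later rows overwrite earlier ones (upsert)
    return list(latest.values()), rejected


def to_gold(silver_rows):
    """Silver -> gold: aggregate traded notional (qty * price) per symbol."""
    notional = defaultdict(float)
    for row in silver_rows:
        notional[row["symbol"]] += row["qty"] * row["price"]
    return dict(notional)


# Made-up raw events, including a correction and two bad records.
bronze = [
    {"trade_id": "t1", "symbol": "BTC-USD", "qty": 0.5, "price": 60000.0},
    {"trade_id": "t2", "symbol": "ETH-USD", "qty": 2.0, "price": 3000.0},
    {"trade_id": "t1", "symbol": "BTC-USD", "qty": 1.0, "price": 60000.0},   # correction to t1
    {"trade_id": "t3", "symbol": "BTC-USD", "qty": -1.0, "price": 60000.0},  # fails quality check
    {"trade_id": "t4", "symbol": "ETH-USD", "qty": "2", "price": 3000.0},    # fails schema check
]

silver, rejected = to_silver(bronze)
gold = to_gold(silver)
```

Keeping rejected rows in a separate list, rather than silently dropping them, mirrors the idea of a quarantine table that makes data quality issues observable.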

Data Engineering Intern

Jan 2024 – Jun 2024 · 5 mos · Bengaluru, Karnataka, India · Hybrid

Azure DevOps, Azure Databricks, SQL, Python

Nokia

NBUC Machine Learning Intern

Apr 2023 – Sep 2023 · 5 mos · Greater Bengaluru Area · Remote

Deep Learning, Python (Programming Language), TensorFlow, Natural Language Processing (NLP)

Belyntic GmbH

Machine Learning Engineer Intern

Jan 2023 – Apr 2023 · 3 mos · Berlin, Germany · Remote

  • Organization: Omdena-Belyntic
  • Project: Developing Novel Self-adjuvant Vaccine for Progressive Multifocal Leukoencephalopathy
Deep Learning, Computer Vision, Natural Language Processing (NLP)

Omdena

Junior Machine Learning Engineer

Dec 2022 – Jan 2023 · 1 mo · Marseille, Provence-Alpes-Côte d'Azur, France · Remote

  • Project: Social Hate and Cyber Crime and Offensive Text Detection in French Social Network Data
  • Key Tasks and Responsibilities
  • (1) Built continuously updating DAG pipelines in Airflow to ingest French Twitter data using the Twitter API, integrated with the data flow architecture.
  • (2) Leveraged the Reddit API and the Facebook Graph API to extract comments from over 100k posts and ingested the data into PostgreSQL.
  • (3) The final dataset consisted of over 50,000 comments and posts in French. I applied various RegEx techniques to clean the dataset. This data was then preprocessed and lemmatized using 4 different parsers from the spaCy library.
  • (4) Created user and developer docs for the Data Engineering team, specifically for the data ingestion scripts.
  • (5) Researched and implemented zero-shot models such as BART to auto-label French Twitter data with an accuracy of 85%, a notable result given that many of these zero-shot models were built for English-language data.
  • (6) Developed Apache Hadoop data ingestion pipelines for large-scale social network data.
  • (7) Implemented dense convolution models with multiple dense layers, reaching an F1 score of 0.66.
  • (8) Implemented DistilBERT models from the Hugging Face Transformers library on the French dataset, augmented and annotated using the zero-shot BART approach, with an accuracy of ~90% and an F1 score of 0.78.
  • (9) Fine-tuned the BERT models with PyTorch Lightning and built a learning rate scheduler with a custom learning rate, pre training stop late, and early stopping. This helped the team determine an accurate number of epochs to achieve the best performance metrics.
  • (10) Implemented a variety of custom loss functions and optimizers from various research papers, such as DICE, ADAM, Black ADAM, and SGD.
  • (11) Implemented a TensorBoard dashboard and charts to visualize the model outputs and interpret the results.
Machine Learning, PyTorch, TensorFlow, Hadoop, Apache Airflow, Natural Language Processing (NLP), +1
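
The RegEx cleaning step and the F1 metric reported for this project can be illustrated with a short standard-library sketch. The specific cleaning rules (dropping URLs and mentions, keeping hashtag words, lowercasing) and the sample text and labels are assumptions made for illustration; the profile does not state which rules or data the project actually used.

```python
import re

# Assumed cleaning rules for social media text; not taken from the project itself.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
NON_TEXT_RE = re.compile(r"[^\w\s'-]")  # \w matches accented letters for str patterns
SPACE_RE = re.compile(r"\s+")


def clean_text(text):
    """Remove URLs and mentions, keep hashtag words, strip punctuation, lowercase."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(r"\1", text)
    text = NON_TEXT_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip().lower()


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


cleaned = clean_text("@jean Regardez ça !! https://t.co/abc #Honte")
score = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

F1 is a common companion to accuracy when the offensive class is rare, since accuracy alone can look high for a model that rarely predicts the positive class.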

Proxmaq

Computer Vision Developer

Oct 2020 – Feb 2021 · 4 mos · Bangalore Urban, Karnataka, India

  • I am working on a project called ProxVision that aims to build an AI-powered device that would help the blind communicate.
  • Some of the technologies that I am working on are:
  • 1. Python web scraping using Selenium and Scrapy
  • 2. Data annotation using LabelImg
  • 3. Object detection and image processing using OpenCV.
  • Here is the website with complete details of the project: www.prox-gen.com

Skillbasics

YouTube Content Creator

Oct 2020 – Dec 2020 · 2 mos · Remote

  • I am creating programming tutorial videos for the Skillbasics YouTube Channel with an aim to improve my soft skills and video production knowledge.
  • Link to the channel: https://www.youtube.com/channel/UCb9AfI-bsXibgECEzyT7Bdw/featured

Education

Dayananda Sagar Institutions

Bachelor's degree, Computer Science & Engineering (Specialization in Data Science)

Jan 2020 – Jan 2024

Deeksha Center For Learning

Class 12 (Intermediate), Computer Science

Jan 2018 – Jan 2020

Clarence Public School

ICSE, High School/Secondary Diplomas and Certificates

Jan 2006 – Jan 2017

GEMS Education

Jan 2005 – Jan 2006
