Aniket Malpure — Product Engineer

I am a master’s student in Applied Data Science at the University of Florida, with a strong foundation in machine learning, natural language processing, and data engineering. I am actively seeking internship opportunities where I can apply my skills to solve real-world challenges and contribute to innovative projects.

My experience as a Data Engineer at Bajaj Finance Ltd. allowed me to design and implement solutions like fuzzy name matching algorithms using deep LSTM Siamese networks, automate ETL pipelines with Azure Data Factory, and build ensemble models to improve verification accuracy, all contributing to increased efficiency and impactful business outcomes.

Currently, as a Graduate Research Assistant at Twister Lab, I am working on building a Speech Emotion Recognition and Voice Analysis System using wav2vec2 and pre-trained models from Audeering and SpeechBrain. I am also gearing up to explore emotion analysis through eye-tracking, which will further strengthen my research and analytical skills.

Beyond work experience, I have developed hands-on projects including real-time stock data streaming pipelines using Kafka and AWS, a Retrieval-Augmented Q&A chatbot with LangChain and Groq, and a football analysis system leveraging YOLO and OpenCV. I am proficient in Python, SQL, TensorFlow, PyTorch, Azure, AWS, Power BI, Docker, and a range of machine learning and data visualization tools. With strong problem-solving abilities and a passion for continuous learning, I am excited to contribute to teams working on impactful data-driven solutions.

Stackforce AI infers this person is a Data Engineering and Machine Learning specialist in the Fintech and SaaS sectors.

Location: Gainesville, Florida, United States

Experience: 2 yrs 1 mo

Skills

Data Engineering
Cloud Migration
Natural Language Processing
Speech Recognition
Data Integration
Data Quality
Machine Learning
Computer Vision
Data Analysis
Data Automation

Career Highlights

Designed fuzzy name matching algorithms using deep learning.
Built a Speech Emotion Recognition system enhancing faculty productivity.
Developed real-time stock data streaming pipelines using Kafka and AWS.

Work Experience

Orlando Utilities Commission (OUC - The Reliable One)

Data Engineering Intern (1 yr 1 mo)

Twister Lab, University of Florida

Graduate Research Assistant (6 mos)

Bajaj Finserv

Data Engineer (1 yr)

Data Engineering Intern (6 mos)

Pune Institute of Computer Technology

Undergraduate Research Assistant (10 mos)

Babel

Data Analyst Intern (9 mos)

Education

Master's degree at University of Florida

Bachelor's degree at Pune Institute of Computer Technology

AI Enabled · AI ML Practitioner

Skills

Core Skills

Data Engineering, Cloud Migration, Natural Language Processing, Speech Recognition, Data Integration, Data Quality, Machine Learning, Computer Vision, Data Analysis, Data Automation

Other Skills

Oracle Database, DDL scripts, AWS S3, Snowflake, Talend, wav2vec2, Audeering, SpeechBrain, PostgreSQL, Azure Data Factory, Power BI, ELT pipelines, Jaro-Winkler, Phonetic algorithms, CNN

Experience

Total Experience: 2 yrs 1 mo
Average Tenure: 1 yr
Current Experience: 1 yr 1 mo

Orlando Utilities Commission (OUC - The Reliable One)

Data Engineering Intern

May 2025 – Present · 1 yr 1 mo · Orlando, Florida, United States · On-site

Prepared an inventory of 150+ Oracle tables and 40+ reports impacted by the assigned project, ensuring comprehensive coverage of dependencies.
Documented DDL scripts for all inventoried tables, providing clear technical specifications to accelerate Talend pipeline creation.
Developed ELT workflows in Talend to extract from Oracle, stage in AWS S3, and load into Snowflake, enabling a seamless on-prem to cloud migration.
Performed transformations in Snowflake across 5 layered environments, enhancing data quality and standardizing schemas for downstream consumption.
Optimized reporting by designing UML-based multidimensional data models, reducing query complexity and cost versus direct SQL-based reporting.

Oracle Database, DDL scripts, AWS S3, Snowflake, Talend, Data Engineering

Twister Lab, University of Florida

Graduate Research Assistant

Sep 2024 – Mar 2025 · 6 mos · Gainesville, Florida, United States · Hybrid

Proposed a Speech Emotion Recognition and Voice Analysis System using the state-of-the-art wav2vec2 model to map emotions with empathy in a classroom environment, enhancing faculty productivity.
Implemented pre-trained models from Audeering and SpeechBrain to extract voice features from the audio responses of 20+ faculty members, building a robust dataset and improving emotional analysis accuracy by 5%.

wav2vec2, Audeering, SpeechBrain, Natural Language Processing, Speech Recognition

Bajaj Finserv

2 roles

Data Engineer

Jul 2023 – Jul 2024 · 1 yr · Pune, Maharashtra, India · On-site

Architected PartnerLending Product on Demand (POD) service with third-party vendors, enabling seamless loan disbursement and generating ~$1M in new loans.
Reengineered Document POD system with PostgreSQL, cutting digital agreement delivery latency by 25% for 5k+ monthly agreements.
Built robust ELT pipelines using Azure Data Factory to integrate PostgreSQL data, improving data availability and reducing latency.
Streamlined reporting workflows by automating data pipelines and Power BI dashboards, cutting manual effort by 100%.
Maintained comprehensive technical documentation and Git-managed version control to support production-grade deployments.

PostgreSQL, Azure Data Factory, Power BI, ELT pipelines, Data Engineering, Data Integration

Data Engineering Intern

Jan 2023 – Jul 2023 · 6 mos · Pune, Maharashtra, India · On-site

Engineered PostgreSQL schemas and functions for EMI POD extension, boosting query performance by 20% and scaling to 3+ business verticals.
Optimized identity resolution in PostgreSQL using Jaro-Winkler and phonetic algorithms, improving name-matching precision by 11%.
Partnered with development and business teams to align schema changes with evolving requirements, ensuring scalability and data integrity.

PostgreSQL, Jaro-Winkler, Phonetic algorithms, Data Engineering, Data Quality

Pune Institute of Computer Technology

Undergraduate Research Assistant

Jul 2022 – May 2023 · 10 mos · Pune, Maharashtra, India · On-site

Built a Plant Disease Classifier using CNN with an accuracy of 95.62%, precision of 94.38%, and recall of 93.60%, enabling effective disease detection in plants.
Explored transfer learning techniques with VGG19 and ResNet to compare performance results, optimizing model selection for higher accuracy.

CNN, transfer learning, VGG19, ResNet, Machine Learning, Computer Vision

Babel

Data Analyst Intern

Oct 2021 – Jul 2022 · 9 mos · Remote

Automated 10+ extraction and reporting processes from Matomo analytics via SQL in DBeaver, enhancing the efficiency of insight generation.
Streamlined raw web data processing workflows by integrating Matomo logs into actionable analytics dashboards.
Utilized advanced SQL techniques to restructure queries and leverage appropriate joins, boosting data extraction speed by 15% in Oracle Cloud.

SQL, DBeaver, Matomo, Data Analysis, Data Automation