Prayag Purani

Data Engineer

United States · 2 yrs 9 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in building scalable AI-driven solutions.
  • Proven track record in data engineering and analytics.
  • Strong background in generative AI and machine learning.
Stackforce AI infers this person is a Data Engineer with expertise in Fintech and AI-driven solutions.

Skills

Core Skills

Data Engineering · Cloud Computing · Data Quality · Teaching · Generative AI · Data Science · AI Development

Other Skills

Apache Flink · Google Cloud Pub/Sub · BigQuery · Trino · Apache Iceberg · Great Expectations · Kubernetes · CI/CD · Power BI · Tableau · Generative Models · GANs · VAEs · Diffusion Models · Python

About

I am passionate about Machine Learning, AI, and Data Science, leveraging data-driven insights to solve complex problems and drive innovation. Pursuing a Master’s in Data Analytics at San Jose State University, I specialize in A/B testing, inferential statistics, real-time data processing, and generative AI. My experience spans building ETL pipelines, designing predictive models, optimizing real-time analytics, and developing AI-powered solutions. I have worked extensively with Generative AI (GANs, VAEs, Transformers), real-time data streaming (Kafka, PySpark, AWS), and Business Intelligence (Tableau, Power BI, Grafana). As a Teaching & Research Assistant, I mentor students in Deep Learning, AI, and Data Engineering, staying engaged with cutting-edge advancements. My goal is to bridge the gap between AI research and real-world applications, continuously learning, evolving, and building scalable AI-driven solutions. I am always open to exploring new opportunities, collaborating on impactful projects, and driving innovation in ML, LLMs, and AI automation.

Experience

Total Experience: 2 yrs 9 mos
Average Tenure: 1 yr 5 mos

Chime

Data Engineer

Jan 2025 - Present · 1 yr 4 mos · United States

  • Engineered real-time financial transaction ingestion pipelines using Apache Flink and Google Cloud Pub/Sub, processing nearly twelve million daily card events into BigQuery datasets supporting fraud monitoring and risk analytics.
  • Developed scalable ELT transformations using Trino and Apache Iceberg tables stored on Google Cloud Storage, consolidating twenty fintech transaction sources into standardized datasets for regulatory reporting.
  • Implemented automated data quality monitoring frameworks using Great Expectations, validating financial event schemas and ensuring transaction consistency across streaming pipelines, reducing downstream reconciliation discrepancies by twenty-four percent.
  • Optimized analytical performance of large BigQuery datasets by redesigning partitioning strategies and query patterns, decreasing average execution time for fraud detection analytics workloads by nearly twenty-nine percent.
  • Designed Kubernetes-based containerized workflows integrated with Git-driven CI/CD pipelines, automating deployment of 25+ production data pipelines and reducing engineering release cycles by 40%.
  • Developed curated analytical datasets powering 30+ Power BI and Tableau dashboards, enabling product and risk teams to monitor key financial metrics and improving reporting turnaround by 35%.
  • Implemented data governance frameworks including lineage tracking and metadata cataloging across 50+ financial data pipelines, improving audit traceability and regulatory reporting readiness.
Apache Flink · Google Cloud Pub/Sub · BigQuery · Trino · Apache Iceberg · Great Expectations · +6 more

San José State University

3 roles

Instructional Student Assistant

Aug 2024 - Dec 2024 · 4 mos · San Jose, CA

  • I am currently serving as an Instructional Student Assistant for DATA 266: Generative Model Applications at San Jose State University, where I support students and instructors in exploring cutting-edge generative models like GANs, VAEs, and diffusion models. My role involves leading lab sessions, providing one-on-one support, grading assignments, and developing course materials to enhance the learning experience. This position has deepened my expertise in generative AI, improved my teaching and problem-solving skills, and allowed me to share my passion for AI with others. I’m excited to contribute to this transformative field and would love to connect with others interested in generative models, AI, or education.
Generative Models · GANs · VAEs · Diffusion Models · Teaching · Generative AI

Graduate Teaching Assistant

May 2024 - Dec 2024 · 7 mos · San Jose, CA

  • Designed and implemented a robust ETL workflow in Python to collect, clean, and integrate surname records from US Census and USPTO datasets; performed exploratory data analysis and statistical validation to address data biases and reduce missing values by 30%.
  • Architected a classification pipeline that starts with broad ethnicity detection using US Census surname data, then refines predictions for Asian subregions using USPTO datasets, enabling fine-grained demographic insights with enhanced model interpretability and scalability.
  • Designed a hierarchical multi-stage demographic classification system achieving 99% precision in ethnicity prediction from last names, leveraging character-level embeddings, LSTM, and NLP techniques; delivered a Flask web application with real-time and user interface
Python · Exploratory Data Analysis · Statistical Validation · NLP · LSTM · Data Engineering · +1 more

Research Assistant

Jan 2024 - Dec 2024 · 11 mos · San Jose, CA

  • Owned and led the end-to-end development of an automated A/B testing platform powered by LLMs, managing the full lifecycle from system design to deployment and reducing experimentation time and effort by 30–70%.
  • Built an AI-agent experimentation system integrating multi-source analytics pipelines from Google Analytics (GA4) and Meta Ads, automating variant generation, hypothesis creation, and evaluation.
  • Spearheaded data ingestion and integration workflows across marketing platforms, improving data accuracy by 25% and increasing actionable insights by 20% for product and growth teams.
  • Engineered a fine-tuned LLaMA-30B model using Low-Rank Adaptation (LoRA), integrated with a Retrieval-Augmented Generation (RAG) pipeline to generate bias-aware, context-rich summaries of experiment outcomes.
  • Developed Tableau product analytics dashboards to track user behavior, funnel performance, and experiment results, directly supporting data-driven decision-making.
A/B Testing · LLMs · Google Analytics · Meta Ads · Tableau · Data Science · +1 more

LTIMindtree

Data Engineer

Aug 2021 - Jul 2023 · 1 yr 11 mos · India

  • Built distributed telecom event ingestion pipelines using Apache Kafka and Apache NiFi, capturing nearly eight million daily network records and storing curated datasets within Azure Data Lake Storage.
  • Engineered scalable batch processing workflows using Apache Spark with Scala on Databricks, transforming multi-source telecom usage data and reducing nightly processing latency across enterprise reporting systems.
  • Designed dimensional data warehouse models in Azure Synapse Analytics using star schema architecture supporting 10+ telecom analytics dashboards and improving query performance by 30%.
  • Implemented ELT transformation workflows using dbt and SQL pipelines, integrating over fifteen telecom billing datasets into structured warehouse tables supporting revenue assurance and customer analytics.
  • Developed orchestration workflows using Apache Airflow for scheduling and workflow automation, improving the reliability of large-scale batch data processing pipelines across multiple telecom operational systems.
  • Implemented Delta Lake lakehouse architecture on Azure storage environments, optimizing partition strategies and improving query performance for high-volume telecom analytical datasets by approximately twenty-six percent.
  • Automated infrastructure provisioning using Docker and Terraform, standardizing deployment across 15+ data engineering environments and reducing environment setup time by 50%.
Apache Kafka · Apache NiFi · Azure Data Lake Storage · Apache Spark · SQL · dbt · +2 more

Education

San José State University

Master's degree — Data Analytics

Aug 2023 - May 2025

Vellore Institute of Technology

Bachelor's degree — Computer Science

Green Valley High School

12th Grade — Science
