Shrey Jain

Co-Founder

Pittsburgh, Pennsylvania, United States · 6 yrs 9 mos experience
AI ML Practitioner · Highly Stable

Key Highlights

  • Developed an innovative synthetic data generation pipeline.
  • Led key feature development at Salesforce for Fortune 500 clients.
  • Founded a startup focused on blockchain finance management.
Stackforce AI infers this person is a Data Scientist with a strong focus on AI and Blockchain technologies.

Skills

Core Skills

Synthetic Data Generation · LLM Fine-tuning · Deep Learning · Data Science · Software Development · Generative AI · Product Development · Blockchain · C++

Other Skills

Business Analytics · Cascading Style Sheets (CSS) · Computer Architecture · Computer Vision · Customer Acquisition · D3.js · Deep Neural Networks (DNN) · Digital Marketing · Google Suite · HTML · Java · JavaScript · Large Language Models (LLM) · Leadership · MLOps

About

A budding Computational Data Scientist who aspires to develop and apply deep learning techniques such as LLMs to solve real-world problems. Among my broad interests, I'm currently focused on pushing the frontiers of safety reasoning through LLMs. I've previously worked on a new paradigm for watermarking synthesized audio during generation rather than after it. I keep up with the latest developments in multimodal LMs, post-training for LLMs, and synthetic data generation to expand performance boundaries. In what now seems like a previous life, I was a software developer, primarily applying computer engineering skills to build software at a high-frequency trading firm and then at Salesforce. Alongside a full-time job, I participated in hackathons and even tried founding a startup in the blockchain world that went down with Bitcoin's nosedive in 2022.

Experience

Total Experience: 6 yrs 9 mos
Average Tenure: 1 yr 9 mos
Current Experience: --

TikTok

Machine Learning Research Intern

May 2025 - Aug 2025 · 3 mos · San Jose, California, United States · On-site

  • Delivered a synthetic data generation pipeline for training the fine-ranking model in TikTok Search Ads recommendation.
  • Increased AUC and accuracy by 3-4% by applying the high-quality CoT generation and SFT (supervised fine-tuning) approach from Li et al., 2025 (FAIR), and Group Relative Policy Optimization (GRPO) from DeepSeekMath.
  • Developed a scalable data curation pipeline for high-quality labeling through prompt engineering of frontier LLMs.
  • Devised full fine-tuning pipelines for LLMs of up to 70B parameters using DeepSpeed ZeRO-3 and vLLM.
  • Achieved 98% of teacher performance with SLMs (Qwen2 0.5B-1.5B) distilled from a fine-tuned Qwen3-8B teacher.
Synthetic Data Generation · LLM Fine-tuning · Reinforcement Learning · Recommender Systems

Carnegie Mellon University - School of Computer Science - Language Technologies Institute

Teaching Assistant (Intro to DL, FCDS, DSS)

Dec 2024 - Aug 2025 · 8 mos · Pittsburgh, Pennsylvania, United States · On-site

  • TA for 11-785 - Introduction to Deep Learning by Professor Bhiksha Raj and Professor Rita Singh
  • TA for 11-637 - Foundations of Computational Data Science by Professor Kemal Oflazer
  • TA for 11-631 - Data Science Seminar by Professor Maarten Sap
Deep Learning · Data Science

PLUS - Personalized Learning Squared

Math Tutor

Oct 2024 - Dec 2024 · 2 mos · Pittsburgh, Pennsylvania, United States · Remote

Stealth Startup

Founder

Aug 2022 - Nov 2022 · 3 mos

  • Built finance management software for blockchain-native organizations (DAOs, Decentralized Autonomous Organizations), taking the product from my hackathon projects (at Polygon's BUIDL IT and Hedera's blockchain hackathon in 2022) to real-world DAOs.
  • Brought in 3 customers for a pilot but had to shut down due to lack of funding and other prior commitments.
Product Development · Product Marketing · Customer Acquisition · Blockchain

Salesforce

Member of Technical Staff

Jan 2022 - Jul 2024 · 2 yrs 6 mos · Bengaluru, Karnataka, India · Hybrid

  • Led development of key features that are used by multiple F500 companies and bring in millions in revenue for Salesforce.
  • Designed and developed data models and backend architecture for parts of the Actionable Segmentation product.
  • Delivered a summary generation tool for the Life Sciences team, leveraging Salesforce Einstein and OpenAI LLMs.
Software Development · Java · Systems Design · JavaScript · Generative AI

Plutus Research Private Limited

Software Developer

Jun 2021 - Jan 2022 · 7 mos · India

  • Improved nanosecond-latency C++ trading systems through OS- and network-level optimizations.
  • Delivered a multi-core PnL report generation system capable of processing terabytes of daily trade logs in milliseconds.
  • Engineered a multi-tiered distributed data storage strategy to extract insights and minimize repeated processing.
Software Development · C++ · Python (Programming Language) · Computer Architecture · Operating Systems · Shell Scripting

UnDosTres

Software Development Intern

Jan 2021 - Feb 2021 · 1 mo · Mexico City, Mexico

  • Tools & languages: Amazon Web Services (CloudWatch), Docker, PHP, MySQL, JavaScript
  • Developed web products on a PHP tech stack and learned how they are hosted on AWS servers
  • Redesigned & documented new beta feature testing; reduced requests to the server during page load by ~15x

Collegespace

3 roles

President

Promoted

Aug 2020 - Apr 2021 · 8 mos

  • Led the development and management of tech products that help thousands of NSIT students every day, built with modern frameworks such as React.js and Flutter.

Content Writer

Aug 2018 - Aug 2020 · 2 yrs

Web Developer

Sep 2017 - Aug 2020 · 2 yrs 11 mos

Energy Web

Blockchain Developer Intern

Aug 2020 - Nov 2020 · 3 mos · Zug, Switzerland

  • Tools & languages: Python, D3.js, Parity Ethereum client, SQLite
  • Developed an ETL pipeline for transaction data and created a plan of action for data dashboarding using Google Data Studio
  • Analyzed over 12 million transactions to develop an algorithm to detect the true owner of EWT (a cryptocurrency)
  • My projects laid the foundation for monitoring the EWChain to help improve its functionality

180 Degrees Consulting NSUT

3 roles

President

May 2020 - May 2021 · 1 yr

Project Leader

Jun 2019 - May 2020 · 11 mos

Consultant

Mar 2019 - Jun 2019 · 3 mos

Evalueserve

Intern

May 2019 - Jul 2019 · 2 mos · Gurgaon, Haryana, India

  • Interned in the corporate and professional services division.
  • Worked on 10+ research tasks as part of a research team supporting case teams at one of the world's top 3 management consulting firms.
  • Automated a task with Excel VBA, completing an 8-hour job in a few minutes; recognized by my team’s AVP for it.

WorldQuant

Virtual Research Consultant

Mar 2019 - Oct 2019 · 7 mos

  • Developed 21 alphas (quantitative trading algorithms) that outperformed the market in mid-frequency trading on the top 3000 liquid US stocks.

Education

Carnegie Mellon University

Master of Science - MS — Computational Data Science

Aug 2024 - Dec 2025

Netaji Subhas Institute of Technology

Bachelor of Engineering — Computer Engineering

Aug 2017 - Jun 2021

Delhi Public School - R. K. Puram

Computer Science

Jan 2013 - Jan 2017
