Vishal S.

Data Scientist

Mumbai, Maharashtra, India · 6 yrs 11 mos experience
AI Enabled · Highly Stable

Key Highlights

  • Expertise in developing advanced LLM systems.
  • Strong background in statistical modeling and data analysis.
  • Proven track record in high-impact projects in fintech.
Stackforce AI infers this person is a Data Scientist with expertise in Fintech and Research domains.

Skills

Core Skills

Data Science · Machine Learning · Data Engineering · Cloud Computing · Research

Other Skills

Amazon Web Services (AWS) · Retrieval Augmented Generation · Natural Language Processing · Kubernetes · Python · SQL · AWS · Prometheus/Grafana · Terraform · Docker · Generative AI · Pandas · BigQuery · FastAPI · NumPy

About

I am a full-stack Data Scientist with 4.5 years of expertise in LLMs, DevOps, and statistical modeling. I developed a Retrieval Augmented Generation system for a non-banking financial company (NBFC). I have modeled data from large physics experiments, namely Belle II and CERN, to infer properties of the Higgs boson and classify quarks. My interdisciplinary background combines theory with practical experience, and I apply both to innovative problem-solving.

Experience

Total Experience: 6 yrs 11 mos
Average Tenure: 2 yrs 3 mos
Current Experience: 2 yrs 4 mos

Morningstar

Data Scientist

Feb 2024 – Present · 2 yrs 4 mos · Navi Mumbai · Hybrid

  • NL2SQL for Equity Rating DB: Built a maintainable Natural Language → SQL system with 83% evaluation accuracy; presented to C-suite stakeholders. Increased robustness through schema and domain context injection and systematic refinement. (Athena, Azure OpenAI, Pinecone, LangChain, RAG techniques: column-based search, keyword-optimized vector retrieval, few-shot prompting, revision feedback loop)
  • Attribute Resolution Scalability (7→14 million): Designed and executed production-like scalability tests; identified memory ceilings from in-pod caching and provided concrete capacity and autoscaling recommendations. (Kubernetes, Flux, Locust, Prometheus/Grafana, EFK)
  • Attribute Resolution Re-architecture: Rearchitected the service to move the cache out of Kubernetes pods into an external database, enabling reliable performance at 14 million scale; achieved 85% cost reduction, 97% faster startup, 64% more throughput, and fewer errors, all while maintaining model quality. (Redis in ElastiCache Valkey, Postgres in RDS)
  • Risk Model Smart Text LLM Integration: Productionized infrastructure and deployment workflows for integrating LLM output into Morningstar Direct. (AWS, Harness CI/CD, Terraform, Docker)
  • Investor Pulse Model Analysis: Delivered accuracy metrics and explainable drivers to stakeholders for decision-making on continuation of the product, saving resources on maintenance. (Python, statsmodels)
  • Marketing Analytics Data Foundation: Implemented Google Analytics 3 data backup and performed EDA to support acquisition/churn modeling. (AWS S3, Python, SQL)
  • CI/CD Performance and Platform Maintenance: Reduced CI runtime by 66%; maintained and secured multiple production rating services; migrated Disaster Recovery pipeline and standardized artifacts. (Harness, SonarQube, Terraform, Jenkins→Harness, msnexus→Sonar artifacts)
Amazon Web Services (AWS) · Retrieval Augmented Generation · Natural Language Processing · Kubernetes · Python · SQL · +2

Newron

Machine Learning Engineer

Sep 2023 – Feb 2024 · 5 mos · Bengaluru · On-site

  • Built a Retrieval Augmented Generation (RAG) system covering both the structured and the unstructured parts of a loan origination dataset.
  • Gathered client requirements and handled data acquisition for the project on the technical side.
  • Cleaned the data with Pandas and ingested it into BigQuery for structured queries and Pinecone for unstructured queries.
  • Engineered prompts with GPT-3.5-turbo, GPT-4-turbo, and LangChain to retrieve information in response to natural language queries.
  • Built GET and POST endpoints using FastAPI and contributed concepts to the dashboard. Implemented interactive scatter plots of India using Plotly for data visualization.
  • Presented a comprehensive demo of the initial system iteration to the client, ensuring alignment with expectations.
Generative AI · Retrieval Augmented Generation · Pandas · BigQuery · FastAPI · Machine Learning · +1

Tata Institute of Fundamental Research (TIFR)

Research Scholar

Aug 2018 – Oct 2022 · 4 yrs 2 mos · Mumbai Area, India · On-site

  • Belle II Event Classification: Built classifier to separate e⁺e⁻ → B signal from q q̄ background; compared ANN/SVM/BDT with imbalance-aware tuning; achieved ROC 0.961. (C++, CERN ROOT)
  • CERN Higgs Coupling Fits: Modeled couplings using non-linear regression; validated Standard Model consistency via χ² minimization within 95% confidence interval. (statistical modeling)
  • β-Ga₂O₃ Structural Characterization: Fit X-ray diffraction peaks using GMM; estimated lattice parameters and coefficient of thermal expansion with error analysis; optimized sample size to meet uncertainty targets. Conducted all lab experiments using X-ray diffraction and electron microscopy. (Gaussian mixture models (GMM), regression)
  • PhD coursework in Physics
NumPy · C++ · Statistical Modeling · Research · Data Science

Education

Tata Institute of Fundamental Research, Mumbai

Master of Philosophy - MPhil — Physics

Aug 2018 – Oct 2022

Birla Institute of Technology and Science, Pilani - Goa Campus

Bachelor of Engineering - BE — Mechanical Engineering

Aug 2013 – Jul 2018

Birla Institute of Technology and Science, Pilani - Goa Campus

Master of Science - MS — Physics

Aug 2013 – Jul 2018
