Sunil Patel

CEO

Mumbai, Maharashtra, India · 10 yrs 6 mos experience

Key Highlights

  • 10 years of experience in Deep Learning and AI.
  • Contributed to NVIDIA's growth to a three trillion-dollar company.
  • Expert in optimizing and scaling Deep Learning models.


Skills

Core Skills

Deep Learning · Model Optimization · CUDA Computing · Artificial Intelligence

Other Skills

AI-Based Banking Solutions · ASR Development · Accelerated Computing · Agile Methodologies · Amazon Web Services (AWS) · Apache Spark · Artificial Intelligence (AI) · Artificial Neural Networks · C · C++ · CSS · CUDA · Clinical Trial Linking · Cloud Computing · Computer Vision

About

Part of a 24-person "TAC Team" that helps clock 1.5 billion dollars a year and counting. With 10 years of experience in Deep Learning, I have witnessed firsthand the remarkable transformation of NVIDIA from a company valued at a few hundred billion dollars into a three-trillion-dollar powerhouse driving the AI revolution. My journey began with Conversational AI, where I contributed to developing and deploying advanced language models for a variety of applications. At NVIDIA, I have been deeply involved in optimizing and scaling Deep Learning models, focusing on large-scale training and inference workloads. My work includes model optimization, profiling, writing optimized operations, and ensuring scalable deployments over GPUs. I have also gained hands-on experience with LLM fine-tuning and RAG workflows, enabling AI datacenters to handle massive LLM training runs. Now, I am dedicated to helping customers realize AI-driven datacenters capable of running these cutting-edge models at scale, pushing the boundaries of what's possible in AI.

Experience

NVIDIA

4 roles

Conglomerates & Industries | Manager Solutions Architecture and Engineering

Promoted

Mar 2025 – Present · 1 yr

Data Scientist - IV Deep Learning

Jun 2021 – Present · 4 yrs 9 mos

  • 1) LLM Training and model optimization
  • LLM finetuning: Worked on finetuning Mistral 70B and LLaMa-2/3 7B/8B/70B for triplet extraction to build graph RAG on financial data.
  • LLM model optimization: Optimizing Mistral, Mixtral, and LLaMa, and scaling deployment over a large pool of GPUs.
  • RAG pipelines: Building RAG pipelines on various modalities such as text, images, video, and docs for an HR bot, a marketing bot, code documentation, product, etc.
  • 2) Deep Learning inference over 1000s of live cameras
  • Architecting platform for large-scale Intelligent Video Analytics
  • Model training on multiple GPUs and multiple nodes with mixed precision. Trained and deployed end-to-end pipelines for face detection and recognition, ANPR, and intruder detection for small objects over long distances.
  • I have worked on model training, model profiling, model ensembles, pipeline profiling, component optimization, scaling, and deployment over Kubernetes.
  • I have worked with optical flow and tiling-based approaches to make the DNN model compute-friendly.
  • Model optimization, ensembles with different backends/models, model/application profiling, writing end-to-end C++/Python DeepStream applications, and deployment with Kubernetes orchestration.
  • 3) ASR - Indian language support
  • Developed a standard procedure for ASR and helped customers develop ASR for multiple Indic languages. QuartzNet 15x5 was used; the final WER over IITM data was 6.4.
  • 4) CUDA: My profile doesn't demand core CUDA skills, but out of interest I learned CUDA. This helps in many ways:
  • DeepStream post-processors require custom libraries for each model to do box filtration
  • Offloading CPU-intensive parts of the model, such as ROI filtering and NMS, to the GPU.
  • Reading existing code and modifying it for TensorRT custom layers.
  • I understand CUDA code at the grid, block, and thread level and can write parallel kernels.
LLM Training · Model Optimization · Deep Learning Inference · Kubernetes · Model Profiling · Face Detection +5
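The ROI-filtering/NMS offload described above can be illustrated with a minimal greedy non-maximum suppression in plain NumPy. This is a CPU sketch of the algorithm, not the CUDA kernel itself; the `[x1, y1, x2, y2]` box format and the IoU threshold are assumptions for illustration:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # suppress boxes overlapping the kept box beyond the threshold
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thresh]
    return keep
```

On a GPU, the pairwise IoU computation is the part that parallelizes naturally, which is why moving it out of the CPU post-processing path pays off at camera scale.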

Data Scientist - III Deep Learning

Nov 2019 – Jun 2021 · 1 yr 7 mos

  • Unreal Engine | Autonomous Driving | Computational linguistics | Computer vision | GPU Acceleration
  • CUDA Computing:
  • 1) Efficiency improvement by explicit memory assignment
  • 2) Partitioning schemes such as interleaved and block partitioning
  • 3) Compressing data using CSR, ELL, and COO formats, with CUDA parallelism on the compressed data
  • 4) Utilizing Unified memory APIs for faster prototyping
  • Nvidia DeepStream
  • 1) Writing an application for object detection and event initialization
  • 2) Application profiling and fine-tuning
  • 3) Large scale deployment of such applications with Kubernetes
  • NVIDIA TensorRT
  • Model conversion from ONNX, PyTorch, or TensorFlow at FP32, FP16, and INT8 optimization levels
  • Multi-Language ASR, and TTS
  • 1) Model training with the NeMo API for QuartzNet and Jasper using PyTorch Lightning
  • 2) Optimizing models using TensorRT and deploying with the Triton Inference Server
  • 3) Supported languages: Hindi, Gujarati, Punjabi
  • Deployment at scale
  • 1) Deploying dockerized application over Kubernetes
  • 2) Ground-up application development that takes advantage of a multi-GPU, multi-node environment when deployed
  • Talks:
  • 1. GPU Technology Conference session – Building Indic ASR Using NVIDIA NeMo and Deploying Models Using Jarvis: https://gtc21.event.nvidia.com/media/1_kz17xdau
  • 2. NVIDIA – Accelerated Database Query Using GPU: https://info.nvidia.com/india-accelerated-database-query-reg-page.html?ondemandrgt=yes
  • 3. IISER Pune – NVIDIA Data Science Ecosystem Tools: https://www.iiserpune.ac.in/events/Workshop+on+Data+Science+Ecosystem+Tools
  • 4. Intel and Analytics Vidhya – Representing Language Mathematically: https://www.innoplexus.com/news/the-convergence-of-big-data-and-machine-learning/
CUDA Computing · NVIDIA DeepStream · NVIDIA TensorRT · Multi-Language ASR · Deployment at Scale · Deep Learning
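The CSR compression listed under CUDA Computing can be sketched in plain Python. This is a CPU illustration of the format only; in the CUDA version, the row loop of the matrix-vector product is what gets parallelized, typically one thread per row:

```python
def dense_to_csr(m):
    """Compress a dense matrix (list of rows) into CSR:
    values, column indices, and row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in m:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # each entry marks where the next row starts
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x computed directly on the CSR representation."""
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y
```

ELL and COO trade the same idea differently: COO stores explicit (row, col, value) triples, while ELL pads rows to a fixed width for coalesced GPU memory access.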

Solutions Architect - Deep Learning

Jul 2019 – Nov 2019 · 4 mos

  • Unreal Engine | Autonomous Driving | Computational linguistics | Computer vision | GPU Acceleration

Innoplexus

Data Scientist - Deep Learning

Nov 2017 – Jun 2019 · 1 yr 7 mos · Eschborn, Germany

  • 1) Document Comparison
  • Detecting syntactic and semantic similarity between two documents, as well as insertions, deletions, and modifications.
  • Tools/Technology: Skip-thought sentence vectors, Siamese networks, PyTorch, NVIDIA V100
  • 2) Primary/Secondary Clinical Trial Linking
  • Linking clinical trials as primary or secondary by comparing the content of clinical trials with millions of research papers.
  • Tools/Technology: Skip-thought sentence vectors, Siamese networks, PyTorch, NVIDIA V100
Document Comparison · Clinical Trial Linking · Deep Learning
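The document-comparison approach above can be sketched with toy sentence vectors. The real system used skip-thought embeddings scored by a Siamese network; the greedy alignment and the similarity thresholds here are illustrative assumptions:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two sentence vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def align_sentences(doc_a, doc_b, match_thresh=0.9, modify_thresh=0.5):
    """Greedily align sentences of doc_a against doc_b by vector similarity.
    Returns one label per sentence of doc_a:
    'match' (near-identical), 'modified' (partial overlap), or 'deleted'."""
    labels, used = [], set()
    for u in doc_a:
        # ignore doc_b sentences already claimed by an earlier match
        sims = [cosine(u, v) if j not in used else -1.0
                for j, v in enumerate(doc_b)]
        j = int(np.argmax(sims))
        if sims[j] >= match_thresh:
            labels.append('match'); used.add(j)
        elif sims[j] >= modify_thresh:
            labels.append('modified'); used.add(j)
        else:
            labels.append('deleted')
    return labels
```

Running the same alignment in the opposite direction (doc_b against doc_a) flags insertions the same way this direction flags deletions.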

GVK

Senior Research Associate - Deep Learning

May 2016 – Nov 2017 · 1 yr 6 mos · Hyderabad Area, India

  • 1) Developed highly scalable Named Entity Resolution utilizing GPU computing – character-level LSTM
  • Completed and delivered a general-purpose framework for any kind of Named Entity Resolution (NER) problem. The solution uses a state-of-the-art ensemble model of a convolutional network and Long Short-Term Memory (LSTM), deployed on TensorRT v2 with custom layers for unidirectional LSTM.
Named Entity Resolution · GPU Computing · Deep Learning
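A character-level LSTM like the one above consumes one encoded character per time step. A minimal sketch of that input encoding follows; the vocabulary and lowercasing choices are assumptions, not the delivered framework's actual preprocessing:

```python
import numpy as np

def char_one_hot(name, vocab="abcdefghijklmnopqrstuvwxyz "):
    """Encode an entity mention as a (sequence_length, vocab_size) one-hot
    matrix, the per-step input format a character-level LSTM consumes.
    Characters outside the vocabulary map to an all-zero row."""
    idx = {c: i for i, c in enumerate(vocab)}
    out = np.zeros((len(name), len(vocab)))
    for t, c in enumerate(name.lower()):
        if c in idx:
            out[t, idx[c]] = 1.0
    return out
```

Working at the character level is what makes the framework general purpose: it needs no entity-specific token vocabulary, so the same model architecture applies to any NER domain.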

Tata group

Software Engineer (Artificial Intelligence R&D)

Aug 2015 – May 2016 · 9 mos

  • 1) AI-Based Banking Transaction Reconciliation
  • Bank of Switzerland wanted to solve its complex ledger-to-statement reconciliation problem. Used a 1D-CNN. Completed the entire project from R&D to a production-ready solution using Keras and Python.
  • 2) Conversational Interface
  • For internal IT service enhancement and as part of Ignio (TCS's IT cognitive system for enterprise IT Ops), completed a project building a conversational system using Natural Language Processing with Word2Vec, H2O, and Python.
AI-Based Banking Solutions · Conversational Interface · Artificial Intelligence

Supercomputing facility

Internship

May 2014 – Jul 2014 · 2 mos · IIT-Delhi

  • A software development on "Protein-protein interaction prediction for dimer formation on the basis of chemical properties and geometrical factors using Deep Learning".

Education

Indian Institute Of Information Technology Allahabad

Master of Technology (MTech), Information Technology

Jan 2013 – Jan 2015
