Prasanna Biswas

Lead ML Engineer

Bengaluru, Karnataka, India · 7 yrs 8 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in optimizing AI systems for performance and accuracy.
  • Co-authored papers for prestigious conferences in AI.
  • Developed patented algorithms enhancing compiler efficiency.
Stackforce AI infers this person is a Machine Learning Engineer specializing in AI systems and performance optimization.

Skills

Core Skills

Generative AI · AI Systems · GPU Programming · Large Language Models (LLM) · Machine Learning · Natural Language Processing (NLP)

Other Skills

PyTorch · Inference-time optimizations · Triton · Continuous Integration and Continuous Delivery (CI/CD) · Kineto · Kernel Optimization · C++ · Heterogeneous Programming · SYCL & DPC++ · Unit Testing · Research and Development (R&D) · Team Leadership · ONNX · Data Structures · Algorithms

About

I am an Advisory Research Engineer at IBM Research – India Lab with close to five years of experience in machine learning, deep learning, and AI systems. My work lies at the intersection of AI algorithms, system-level optimizations, and high-performance computing. At Intel, I developed high-performance GPU kernels with SYCL for Falcon Shores architecture, optimized performance-critical operators, and explored advanced ML research combining VAEs with Diffusion Models, co-authoring papers submitted to CVR 2025 and IEEE CONNECT 2025. Previously at Qualcomm, I optimized ONNX models for NLP, CV, and LLMs on the AI100 accelerator and contributed to custom node fusion operations for inference acceleration. I hold a Master’s in Computer Science from IIT Bombay, where my research focused on multimodal meta-learning for sarcasm and emotion analysis. My expertise spans deep learning for NLP, CV, and LLMs, GPU kernel optimization, AI systems, and bridging research with real-world performance.

Experience

7 yrs 8 mos
Total Experience
1 yr 7 mos
Average Tenure
9 mos
Current Experience

IBM

Advisory Research Engineer

Sep 2025 - Present · 9 mos · Bengaluru, Karnataka, India · Hybrid

  • Working on
PyTorch · Inference-time optimizations · Generative AI · Triton · Continuous Integration and Continuous Delivery (CI/CD) · Kineto · +1

Intel Corporation

AI Software Solutions Engineer

Jan 2024 - Aug 2025 · 1 yr 7 mos · Bengaluru, Karnataka, India · Hybrid

  • Developed high-performance kernels for deep learning operators on the upcoming Intel GPU using SYCL.
  • Enhanced the software stack by building an optimized graph in C++ to handle complex operations, using MLIR.
Kernel Optimization · C++ · Heterogeneous Programming · SYCL & DPC++ · Unit Testing · GPU Programming · +1

Qualcomm

2 roles

Senior Machine Learning Engineer

Promoted

Nov 2022 - Jan 2024 · 1 yr 2 mos

  • Spearheaded ONNX optimizations for NLP and CV models on Qualcomm’s AI100 accelerator, achieving a notable performance boost for large language models (LLMs).
  • Doubled the efficiency of NLP transformer decoder models by implementing key ONNX optimizations and caching Key-Value matrices.
  • Developed a Graph Neural Network algorithm to enhance compiler efficiency, resulting in a filed patent.
  • Led a three-member team in optimizing and deploying models from Hugging Face on AIC 100.
Generative AI · Research and Development (R&D) · Team Leadership · ONNX · Inference-time optimizations · Large Language Models (LLM) · +1
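
The Key-Value caching mentioned above can be illustrated with a minimal sketch. It is a generic, single-head toy in plain Python, not the ONNX or AI100 implementation from this role: the names `attend`, `KVCache`, `decode_with_cache`, and `decode_without_cache` are hypothetical, and the query, key, and value vectors are assumed to be already projected. In a real decoder the saving comes from not recomputing key and value projections for earlier tokens at each step.

```python
import math

def attend(q, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

class KVCache:
    """Stores the key and value vectors of the tokens decoded so far."""
    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, q, k, v):
        # Append only the new token's key/value, then attend over the cache.
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

def decode_without_cache(qs, ks, vs):
    # Baseline: rebuild the key/value history from scratch at every step.
    return [attend(qs[t], ks[: t + 1], vs[: t + 1]) for t in range(len(qs))]

def decode_with_cache(qs, ks, vs):
    cache = KVCache()
    return [cache.step(q, k, v) for q, k, v in zip(qs, ks, vs)]

# Toy per-token vectors (made up for illustration).
qs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ks = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
vs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
cached = decode_with_cache(qs, ks, vs)
baseline = decode_without_cache(qs, ks, vs)
```

Both paths produce the same attention outputs; the cached path touches each past key and value once instead of rebuilding the history per step.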

Machine Learning Engineer

Nov 2020 - Nov 2022 · 2 yrs

  • Worked on ONNX optimizations for NLP (Natural Language Processing) and CV (Computer Vision) models for faster inference on Qualcomm’s AI100 accelerator.
  • Designed and implemented software modules for Artificial Intelligence/Deep Neural Network frameworks and tools in C++ and Python, automating general (ONNX / TF frozen) graph optimizations.
  • Implemented auto-detection of the post-processing part of image classification and object detection models and replaced it with optimized kernels to improve model accuracy during quantization.
  • Implemented graph algorithms for sorting nodes and removing unused nodes in a graph for faster inference.
  • Deployed models from different ML frameworks (PyTorch, TensorFlow, ONNX) for cloud/edge use cases.
Data Structures · Algorithms · Natural Language Processing (NLP) · Computer Vision · Python (Programming Language) · Deep Learning · +6
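
The node sorting and unused-node removal described in this role can be sketched on a toy graph. This is an illustrative assumption, not the actual tooling: the graph is a plain dictionary mapping each node name to the names of its inputs, the helpers `prune_unused` and `topological_order` are hypothetical, and the sort shown is Kahn's algorithm, one common choice for ordering a computation graph.

```python
from collections import deque

def prune_unused(graph, outputs):
    """Keep only nodes that a graph output depends on (backward reachability)."""
    live = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in live:
            continue
        live.add(node)
        stack.extend(graph[node])  # visit the inputs of every live node
    return {n: ins for n, ins in graph.items() if n in live}

def topological_order(graph):
    """Kahn's algorithm: every node appears after all of its inputs."""
    indegree = {n: len(ins) for n, ins in graph.items()}
    consumers = {n: [] for n in graph}
    for n, ins in graph.items():
        for src in ins:
            consumers[src].append(n)
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for c in consumers[n]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order

# Toy model graph (node names are made up): "debug_shape" feeds no output.
graph = {
    "input": [],
    "conv": ["input"],
    "relu": ["conv"],
    "debug_shape": ["conv"],
    "softmax": ["relu"],
}
pruned = prune_unused(graph, outputs=["softmax"])
order = topological_order(pruned)
print(order)  # ['input', 'conv', 'relu', 'softmax']
```

Pruning first keeps the sort from scheduling nodes whose results are never consumed, which is where the inference-time saving would come from.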

Indian Institute of Technology, Bombay

3 roles

Research Assistant

Promoted

Aug 2020 - Nov 2020 · 3 mos · Mumbai, Maharashtra, India

  • As a Research Assistant, I developed a multimodal approach to understanding emotions in sarcasm.
  • The work touches upon the incongruities of language and the hidden emotions behind a sentence. In text-to-text, my focus has been on problems such as emotion classification, sentiment analysis, and understanding sarcasm.
  • Worked in a joint collaboration between IBM and IIT Bombay on understanding emotions in sarcasm.
  • Trained a transformer-based architecture to leverage the relation between video, audio, and textual features. Showed that emotion information was necessary to identify sarcasm more precisely. Experiments with emotion information had 15.6% better performance.
Natural Language Processing (NLP) · Python (Programming Language) · PyTorch · Deep Learning · Data Science · Speech Processing · +1

Teaching Assistant

Promoted

Dec 2019 - Jun 2020 · 6 mos

  • Assisted the professor in the Embedded Systems course.
Data Structures · PyTorch · Deep Learning · C++ · Team Leadership · C (Programming Language)

Teaching Assistant

Jul 2018 - Dec 2019 · 1 yr 5 mos

  • Assisted the professor in the course Computer Programming and Utilisation.
Algorithms

PGCult - Culturals at IIT Bombay

Cultural Secretary

Jun 2019 - Jun 2020 · 1 yr

Team Leadership

Education

Indian Institute of Technology, Bombay

Master's degree — Computer Science

Jan 2018 - Jan 2020

Vivekanand Education Society's Institute Of Technology

Bachelor of Engineering - BE — Computer Engineering

Jan 2014 - Jan 2018

SMT. CHM College, Ulhasnagar - 03

Higher Secondary Education — Science and Mathematics

Jan 2012 - Jan 2014
