Deepa Pandey

Software Engineer

Bengaluru, Karnataka, India · 4 yrs 7 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Expert in Generative AI model optimization.
  • Proficient in building scalable Python APIs.
  • Experience with diverse AI model backends.
Stackforce AI infers this person is a Backend-heavy Fullstack Engineer specializing in AI and machine learning technologies.

Skills

Core Skills

Generative AI, Python

Other Skills

Algorithms, Cascading Style Sheets (CSS), Celery, Citrix Products, Data Structures, Django REST Framework, Docker, FastAPI, Git, GitHub, HTML5, JavaScript, Microsoft SQL Server, ONNX, Pandas

About

Currently working on Generative AI, contributing to optimization, preparation, conversion, and execution of multiple foundation models (including LLaMA variants and others). Work involves model optimization, format conversion, quantization, and enabling efficient on-device inference workflows across Android and Snapdragon-based targets.

Previously worked on the QAIRT Development Python API, migrating functionality from existing QAIRT CLI tools into structured, Python-native workflows. Built APIs covering model conversion, quantization, ahead-of-time compilation, execution, and analysis, enabling end-to-end execution of ML and GenAI models without direct CLI usage. Contributed to support for ONNX, PyTorch, TFLite, and GGUF models, execution across CPU, GPU, HTP, DSP, and AIC backends, and clear separation between host-side model preparation and target-side runtime execution across Linux, Windows, Android, and Snapdragon devices.

Earlier, worked on Qualcomm AI Studio, building backend systems in Python using FastAPI. Developed REST APIs and real-time execution workflows using Redis, Celery, message queues, and WebSockets. Contributed to systems that run, convert, optimize, and quantize models (including ONNX), and prepare them for on-device deployment across Android, Windows, Linux, x86, and multiple hardware targets, in a fully Dockerized environment integrated with internal CI/CD pipelines.

Experience

Qualcomm

Software Engineer (Qualcomm AI Studio)

Nov 2022 - Present · 3 yrs 4 mos · Bengaluru, Karnataka, India · On-site

  • Generative AI Platforms
      • Work on model preparation and on-device execution for multiple GenAI models on Snapdragon hardware.
      • Contribute to model conversion, optimization, quantization, compilation, and runtime execution.
      • Deliver Jupyter notebook–based workflows for customers to evaluate and run models across targets.
  • QAIRT (Qualcomm AI Runtime) – Python APIs
      • Led migration of QAIRT CLI tools into Python-native APIs, enabling end-to-end AI workflows without direct CLI usage.
      • Built APIs for model conversion, quantization, compilation, execution, and analysis.
      • Enabled execution of ONNX, PyTorch, and TFLite models across CPU, GPU, HTP, DSP, and AIC backends.
  • Qualcomm AI Studio – Backend Engineering
      • Built backend services in Python (FastAPI), designing REST APIs and real-time execution workflows.
      • Implemented scalable model execution pipelines using Redis, Celery, message queues, and WebSockets.
      • Enabled model execution, ONNX conversion, optimization, and on-device deployment across multiple platforms and hardware targets.
Generative AI, Python, FastAPI, ONNX, PyTorch, TFLite, +4 more

Tata Consultancy Services

2 roles

Systems Engineer

Jul 2021 - Oct 2022 · 1 yr 3 mos · India

Intern

Jan 2021 - Jun 2021 · 5 mos · India

Education

Govt. College of Engg. and Textile Technology, Serampur

Bachelor of Technology (BTech), Computer Science

Jan 2016 - Jan 2020
