Himanshu Garg

DevOps Engineer

Bengaluru, Karnataka, India · 6 yrs 9 mos experience
Highly Stable

Key Highlights

  • Expert in compiler optimizations for AI hardware.
  • Experience with large language models and deep learning frameworks.
  • Strong background in hardware-software co-design.
Stackforce AI infers this person is a specialist in AI hardware optimization and compiler engineering.

Skills

Core Skills

Compilers · Machine Learning

Other Skills

LLVM · Optimization · Software Development · MicroFormat · PyTorch · LLMs · TensorFlow · ONNX · Caffe2 · Node.js · HTML · CSS · JavaScript · Python · MATLAB

About

Skilled and interested in hardware-software co-design, computer architecture, machine learning, and digital design. Currently working at Google Bangalore as a Silicon Compiler Engineer. Worked on the GLOW compiler and system design for next-generation AIC100 hardware. Helping optimize LLMs and generative models on hardware accelerators.

Experience

Total Experience: 6 yrs 9 mos
Average Tenure: 5 yrs
Current Experience: 1 yr 9 mos

Google

Compiler Software Engineer

Sep 2024 - Present · 1 yr 9 mos · Bangalore Urban, Karnataka, India · On-site

  • Working on EdgeTPU backend compilation and optimizations for upcoming Pixel phones.
  • HW-SW co-optimization for Gemini and Veo models.
LLVM · Optimization · Machine Learning · Software Development · Compilers

Qualcomm

3 roles

Lead Compiler Engineer

Dec 2023 - Aug 2024 · 8 mos

  • Working on compression techniques such as MicroFormat (MXFP and MXINT).
  • PyTorch 2.0 eager mode support for running LLMs on custom hardware.
MicroFormat · PyTorch · LLMs · Compilers · Machine Learning

Senior Compiler Engineer

Promoted

Nov 2021 - Nov 2023 · 2 yrs

  • Worked on the system design of next-generation AIC accelerators.
  • Worked on backend compiler optimizations such as operator scheduling, memory allocation, and tiling.
  • Optimized models such as LLMs, generative models, and CodeGen.
Machine Learning · Software Development · Compilers

Machine Learning Engineer

Jul 2019 - Nov 2021 · 2 yrs 4 mos

  • Worked on support for DL frameworks such as TensorFlow, ONNX, and Caffe2 on AIC100.
  • Developed optimized kernels on the Hexagon Vector DSP for object detection algorithms.
  • Developed various DL operators supported by existing frameworks.
  • Applied post-training quantization techniques to achieve floating-point-like accuracy.
TensorFlow · ONNX · Caffe2 · Machine Learning

Adobe

Product Development Intern

May 2017 - Jul 2017 · 2 mos · Noida Area, India

  • Objective: Extract images with higher aesthetic value from a video using deep learning.
  • Developed a web page using Node.js (HTML + CSS + JavaScript) for labelling the collected images.
  • Pre-processed data into the required LMDB and HDF5 formats using a Python script. Converted a previously available SqueezeNet to a regression model by adding Euclidean loss, concatenation, and silence layers, and fine-tuned it from an ImageNet model using GPU-enabled PyCaffe.
  • Reduced the total size from 35 MB to 5 MB and the computation time from 0.4 s to 0.2 s at the same MSE of 0.02.
Node.js · HTML · CSS · JavaScript · Python

Creative technology workshop

Winter Intern

Dec 2016 - Jan 2017 · 1 mo · Mumbai Area, India

  • Objective: R&D of speech transformation techniques using signal processing.
    ◦ Detected word onset and offset in a sentence using an FFT energy detection technique in MATLAB.
    ◦ Detected word vowels by finding maxima of the FFT energy derivative, computed with a Gaussian derivative filter.
    ◦ Evaluated system performance using software such as Praat and Audacity.
MATLAB

Education

Indian Institute of Technology, Bombay

Master of Technology - MTech

Jan 2014 - Jan 2019
