Kumar Vemuri

Head of Design

Reading, England, United Kingdom · 26 yrs experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Over 25 years in semiconductor engineering management.
  • Led architecture for multiple billion-dollar revenue products.
  • Expertise in AI and GPU technologies.
Stackforce AI infers this person is a Semiconductor and AI technology leader with extensive experience in GPU architecture.

Skills

Core Skills

Technology Leadership · Product Management · Engineering Leadership

Other Skills

Chip Architecture · Artificial Intelligence (AI) · Semiconductors · ASIC · Hardware Architecture · SoC · Verilog · FPGA · Embedded Systems · VLSI · SystemVerilog · EDA · Software Development · IC · RTOS

About

A semiconductor executive with over two and a half decades of senior engineering management, technology leadership and product management experience.

  • Engineering management (organisational, cross-functional and project management) and technology leadership on several products that generated $B+ revenues.
  • 22 years of experience in engineering management, architecture and design of highly parallel, multi-threaded, ultra-low-power GPUs and AI accelerators.
  • Engineering management, architecture and design of processor cores for 8 generations of Intel Integrated GPUs, up to Skylake.
  • Engineering management, architecture and design of the first-generation embedded GPU architecture (from the ground up) at ThinCI (now Blaize).
  • 6 years of experience in building high-performance accelerator cores/IP for deep learning / AI and computer vision.
  • Conceived two different AI / deep-learning accelerator architectures: a configurable fixed-function architecture for FPGAs and an OpenCL-extended massively-parallel processor architecture.
  • Strong understanding of the hardware-software interface; experience in software driver architecture (for 3D graphics / OpenCL, AI, deep learning).
  • Experience in building performance and power models.
  • In-depth understanding of major 3D graphics and GPGPU APIs.
  • Architecture, strategy, pathfinding and road-map planning experience.
  • Experience in start-up fund-raising.
  • Bootstrapped and built advisory and founding teams for a start-up.
  • Experience in driving early tech-validation with partners / customers (Integrators / OEMs).
  • Pre-sales partner / customer engagements.
  • Industry relationships with leading EDA and design services vendors.
  • Early-stage start-up and large corporation experience.

Experience

Total Experience: 26 yrs
Average Tenure: 3 yrs 3 mos
Current Experience: 2 yrs 11 mos

Blaize

Head of Product

Jul 2023 - Present · 2 yrs 11 mos

Technology Leadership · Product Management

Imagination Technologies

Vice President, Architecture & Modelling

Oct 2021 - Jun 2023 · 1 yr 8 mos · Kings Langley, England, United Kingdom

  • GPU, AI, Heterogeneous Compute
Technology Leadership · Engineering Leadership · Product Management

Start-up

Founder

Apr 2019 - Sep 2021 · 2 yrs 5 mos · Hyderabad, India

  • Massively-Parallel Processor for cognitive computing and AI at the EDGE.
  • Conceived the architecture, assembled the core team, drove successful early tech-validation with potential partners / customers (Integrators, OEMs, Chipmakers), and led the fund-raising effort; explored VCs in USA, India, Singapore & EU. Built a decent Rolodex.
  • Acquired the tools / filters to cut through the BS in the venture-capital and start-up ecosystems. Overall, a steep learning curve.
Technology Leadership · Engineering Leadership · Product Management

Xilinx

Director of Engineering / Chief Architect, AI Stack for Xilinx MPSoC

Nov 2015 - Feb 2019 · 3 yrs 3 mos · Hyderabad Area, India

  • Technology / Engineering / Management.
  • Headed the worldwide organization for architecture, design and delivery of hardware accelerator IPs for computer-vision and deep-learning on Xilinx MPSoC FPGAs. (https://www.xilinx.com/products/design-tools/embedded-vision-zone.html)
  • Cross-functional leadership and coordination of engineering, tool dev, IT, product management and marketing teams spread across geographies (India, USA, Korea, Japan, EU)
  • Chief Architect, CHaiDNN (https://github.com/Xilinx/CHaiDNN), a state-of-the-art open-source deep-learning (DNN) hardware accelerator IP/library for Xilinx FPGAs. A scalable and configurable architecture for the Xilinx Zynq/MPSoC family of FPGA devices. Best-in-class performance on several convolutional neural-network (CNN) topologies like Alexnet, VggNet, GoogleNet, SqueezeNet, Resnet(s), Single Shot Detector (SSD), Mobilenet-SSD, FCN etc. Patents pending.
  • Deep pre-sales technology engagements with multiple automotive, video-surveillance and industrial vision tier-1 / OEMs
  • High-Level-Synthesis (HLS) and SDx based designs.
  • 3 Patents
Technology Leadership · Engineering Leadership · Product Management

Blaize

Director of Hardware Engineering / Chief Scientist

Jan 2012 - Oct 2015 · 3 yrs 9 mos · Hyderabad Area, India

  • First member on the founding team
  • Engineering management, Technology leadership and Architecture
  • Built a strong and enthusiastic team of 40 hardware and software engineers.
  • Hardware architecture (from the ground up) and design: Architecture of 1/3 of the first-generation core. OpenCL architecture, OpenGL-ES3 compliant Sampling-Engine, ultra-low-power SIMD floating-point unit for the programmable execution-unit, Last-Level Cache, Fixed-Function geometry-processing blocks; Instance complexity: 5M+ Gates.
  • Software architecture (from the ground up) and design responsibilities: OpenCL2.0 UMD Driver for Linux, OpenGL ES UMD Driver for Linux, C++ Based Functional Simulator for Sampling Engine.
  • Process/Flows/Methodologies (from the ground up): definition and hands-on development of a performance-estimation/analysis methodology, a power-calibration methodology using an emulation platform, a verification environment for the Cadence emulation platform (Palladium), and a pre-layout synthesis environment; definition of the validation environment.
  • EDA and Design services vendor engagement.
  • Operations Management during the boot-up phase
  • 5 Patents
Technology Leadership · Engineering Leadership

Intel Corporation

GPU Architect

May 2004 - Jan 2012 · 7 yrs 8 mos · Bangalore

  • One of the earliest hires of the graphics IP engineering team at Intel India Development Centre. Played a key role in growing the team to over 250 front-end and back-end engineers by 2011; over the years the team has designed and productized several multi-$B revenue-generating products completely out of India.
  • Technology leadership and engineering management on 8 generations of Intel Integrated GPU Core.
  • Around 8 years on the Architecture and Design of highly parallel, multi-threaded, low-power SIMD processor core (Execution Unit)
  • Micro-architect and engineering lead on the fourth- and fifth-generation integrated-graphics cores of Intel: Broadwater (90nm), Bearlake (90nm), Cantiga (65nm), Eaglelake (65nm), Ironlake (45nm). Cantiga and Eaglelake were the first chipsets, and the first $B+ revenue-generating chipsets, to be completely architected and executed out of the India Development Center. Led the technology effort on these projects, from concept to PRQ.
  • Architect and engineering lead on the seventh-, eighth- and ninth-generation integrated-graphics cores of Intel: Ivybridge (22nm), Haswell (22nm), Valleyview (22nm), Broadwell (14nm), Skylake (14nm).
  • Led the pathfinding activity for a low-power programmable graphics core on Cantiga, Valleyview, Cherryview, Broadwell and Skylake. The significant power savings enabled design wins in the handheld space for the first time.
  • Cross-engineering team coordination; Coordination of Front-end Engineering, Structural Design engineering, Post-Si validation, Software Driver Development teams.
  • Performance architect on Larrabee (an x86 based Discrete Graphics chip). Microsoft reference-rasteriser based performance analysis of the Larrabee architecture.
  • High-Level Performance Modelling
  • Guided a PhD on “Low Power Graphics Processing Units” at Indian Institute of Technology (IIT), Delhi.
Technology Leadership · Engineering Leadership

HCL Technologies

Member of Technical Staff

Apr 2003 - Apr 2004 · 1 yr · Chennai

  • Design, Validation and Timing Closure of the DMA block of Xambala Inc's semantic processor. The processor was built on NEC gate array technology.
Technology Leadership · Engineering Leadership

Rendezvous On Chip India Pvt Ltd.

Sr Design Engineer

Dec 1999 - Apr 2003 · 3 yrs 4 mos · Hyderabad Area, India

  • Architecture, Micro-Architecture, Design Closure of several blocks of a TCP offload engine and accelerator.
  • Specification (RFC) to architecture to design closure for several RFCs.

Education

Manipal Institute of Technology

Master of Science (M.S.) — VLSI-CAD.

Jan 2010 - Present

Acharya Nagarjuna University

Bachelor of Technology (B.Tech.) — Electrical and Electronics Engineering

Jan 1995 - Jan 1999

Kendriya Vidyalaya

High School
