Srikanth Murthy

Associate Partner

Bengaluru, Karnataka, India · 18 yrs 3 mos experience
AI Enabled · Highly Stable

Key Highlights

  • 16 years of experience in the Semiconductor Industry.
  • Expertise in Physical Design and Verification.
  • Proven track record in AI Architecture exploration.
Stackforce AI infers this person is a Semiconductor Design Expert with a focus on AI and Physical Design.

Skills

Core Skills

AI Architecture · Physical Design · Timing Closure · Low Power Design

Other Skills

PPA Optimization · TCL · Python · Hierarchical TOP · Congestion Analysis · Clock Network Design · ECOs · Custom PG Optimization · DRC/LVS Fixes · System on a Chip (SoC) · Design · Low power design timing closure · Routing · Floorplanning · Low power placement

About

16 years in the semiconductor industry, currently working on AI architecture exploration. Extensive experience in physical design and verification, and in layout automation using TCL/Python. Worked on 5nm, 7nm, 14nm, 16nm, 20nm, 28nm and older technology nodes (TSMC/Samsung/Intel/UMC). SoCs/IPs include AI cores with Peta/Exa FLOPS, HPC-84 cores, VR-AI, Autochip, CPU, Zipline, mobile phone, Intel's first- and second-generation GFX processors (Broadwell/Haswell), and Intel's server product Xeon (Eagleton), the world's first 10-core processor. Focused on hierarchical TOP floorplanning, partitioning, and congestion analysis, targeting a better aspect ratio for sub-blocks and minimal interface timing, leading to efficient PPA. Expertise in timing- and congestion-critical designs and PV analysis. Currently looking for a challenging role that puts my strengths and skills to the test. I strongly believe in embracing the challenges a design or project offers.

Experience

Total Experience: 18 yrs 3 mos
Average Tenure: 5 yrs
Current Experience: 3 yrs

Samsung Semiconductor

Associate Director

May 2023 - Present · 3 yrs · Bengaluru, Karnataka, India · On-site

  • AI architecture exploration:
    • Providing a PD perspective on feasible AI architectures, targeting throughput and PPA.
    • NPU/systolic-based computation implementation for AI/ML training or inference.
    • Pathfinding computation units to achieve better PPA, targeting lower power and improved throughput (Peta/Exa FLOPS).
AI Architecture · Physical Design · PPA Optimization · TCL · Python

Meyvnsystems

Senior Staff Engineer - Physical Design

Feb 2020 - May 2023 · 3 yrs 3 mos · Gyeonggi, South Korea

  • Worked on hierarchical TOP. The goal was a better aspect ratio for sub-blocks in terms of timing and congestion, and optimized interface timing (better PPA). Enabled several CTS techniques to build an efficient clock network for sub-blocks, eventually converging both TOP and sub-block timing.
  • Ran STA and provided timing ECOs for sub-blocks, focusing on interface timing.
  • Project Details:
  • Cortex A53: TOP and quad CPU, ICC2. Samsung process
    • 5/8nm process; the module is a hierarchical TOP (550k) consisting of a quad CPU core (600k).
    • Several topologies were used to satisfy the chip requirements and reduce area.
    • Upgraded the clock from 1.2 GHz to 1.6 GHz; limiting SLVT usage to reduce leakage made timing closure challenging.
    • Proper L2 cache placement to control the data module and converge timing.
  • Data Center Chip: Zipline Subsystem Module, ICC2. Samsung process
    • 14nm process; the module is a hierarchical TOP (2.5 mil) consisting of 10 sub-blocks (10 mil).
    • Enabled several CTS techniques to build an efficient clock network for sub-blocks, eventually converging both TOP and sub-block timing.
    • Developed a GUI-based PV tool in Python to run PV for sub-blocks, sub-systems, and full chip.
  • AI chip: VR Product, ICC2. Samsung process
    • 5nm process; handled a block of 8 mil with 4 sub-blocks, which acts as a compute machine.
    • Multiple power domains and a complex floorplan with 200 high-density memories.
    • Resolved multiple clock gating and low power design issues, converging on both timing and power.
    • Congestion was a major issue: 200 memories in a small sub-block, with narrow memory channels adding SI issues. Memory regrouping was challenging but necessary, as MBIST control registers were dominating.
    • Converged sub-blocks for TOP integration as interface timing became critical; several feedthroughs needed special care.
Hierarchical TOP · Timing Closure · Congestion Analysis · TCL · Clock Network Design · Physical Design

MediaTek

Physical Design Engineer

Mar 2014 - Jan 2020 · 5 yrs 10 mos · Singapore

  • AutoChip: ICC and ICC2 (3-month turnaround). TSMC process
    • 28nm LP flat chip with 4 power domains and 3 million instances.
    • Complex floorplan because of multiple power domains and a custom PG build.
    • Inserted a custom scan chain and handled special user-defined analog nets.
    • Multiple functional ECOs for newly added features.
    • Worked on another version of AutoChip (automobile AI), a hierarchical 16nm chip: handled a timing-critical block of 2 million instances, with multiple floorplan iterations and controlled module placement for better timing.
  • Digital TV, Network Chip: ICC/ICC2 (20nm, 28nm, 40nm, 55nm, with 2 million instance designs). TSMC process
    • Worked on several blocks and macros across different projects, from floorplan to PV and ECOs.
    • Custom PG optimization for IR improvement. Improved CTS through AE and designer interactions for multi-clock designs.
    • Challenging routing for rectilinear macros, optimized using keepouts and blockages.
    • Checked feedthroughs and customized the utility for better QoR.
    • Leakage power fixes through VT swaps and PECO in Tweaker.
    • Fixed DRCs using custom scripts and automation to save turnaround time.
  • HeartRate Monitor: ICC (2 months). TSMC process
    • 40nm, 200k design; flat chip with flip-chip and BGA packages for the same design, supporting multiple clients.
    • The challenge was IO placement across different packages while satisfying RDL routing.
    • ESD fixes were crucial, as they affected IO placement, routing, and PG requirements.
    • PV was challenging, with LVS made tricky by the 2 packages. Functional ECOs for newly added features made it tougher, but the iterations helped optimize the ECO cycle for later projects.
Physical Design · Timing Closure · ECOs · Custom PG Optimization

Intel

Physical Design Engineer

Jan 2008 - Feb 2014 · 6 yrs 1 mo

  • Broadwell Graphics:
    • 14nm, 13 metal layers. Pioneered floorplan databases, successfully executed LV for two mock tape-ins, and delivered most blocks at tape-in quality with all planned and surprise ECOs.
    • Delivered partition layouts for chiplets and full chip on schedule, targeting DFM and low power, and clean for DRC/LVS/reliability, clock spine, clock shield, and noise fixes. Supported ECOs, RC extraction, IR drop, and other post-layout analysis per design engineers' requirements. Owned block-level layouts and led junior layout designers as the block layout required.
  • HSWGTH Graphics:
    • 22nm, 10 metal layers.
    • Played a key role in driving automation for most DRC fixes and worked with the automation team to port most of the existing automation to the new technology node. Targeted DFM and low power, clean for DRC/LVS/reliability, and noise fixes.
    • ECOs and physical integration with full-chip antenna fixes, plus integration with the microprocessor's uncore, is a tedious, multi-iteration job that demands a careful eye for DRC and density detail to foresee issues.
    • Led a team of 20 contractors for the project, training them on process technology, tools, and various LV flows.
  • Eagleton 10-core server processor:
    • 32nm, 9 metal layers. Responsible for placement and routing optimization of the clock spines. The design targeted clean DRC, LVS, and IR drop, plus noise fixes.
    • The design was implemented with ECOs from the circuit designer to optimize timing and noise. Most signals were routed manually to avoid additional ECOs.
    • Responsible for layout and convergence of 27 clock spines in a multi-core processor.
    • Involved extensive matching and track planning for the initial clock grid distribution; all slots and standard cells were customized for better clock performance.
Physical Design · ECOs · DRC/LVS Fixes · Timing Closure

Education

Visvesvaraya Technological University

Bachelor of Engineering - BE — Electronics and Communications Engineering

Jan 2003 - Jan 2006

M.N.Technical Institute - India

Diploma in Engineering - Electronics and Communication

Jan 2000 - Jan 2003
