Prashant Vithule

Software Engineer

Tiroda, Maharashtra, India · 1 yr 9 mos experience

Key Highlights

  • Expert in AI systems and LLM optimization.
  • Proven track record in high-performance computing.
  • Skilled in advanced vectorization techniques.

Skills

Core Skills

Large Language Models (LLM) · NUMA · SVE

Other Skills

Advanced Vector Extensions (AVX) · SIMD · OpenMP · Thread affinity

About

Passionate about AI systems, LLM and LMM inference optimization, and high-performance computing. Worked on quantized kernel optimization, speculative decoding, thread affinity, NUMA-aware design, and SIMD/SVE acceleration for large language models. Experienced in C, C++, CUDA, OpenMP, LLVM, NUMA, and GPU programming.

Experience

Total Experience: 1 yr 9 mos
Average Tenure: 1 yr 8 mos
Current Experience: 1 mo

AMD

Sr. Software System Designer

Apr 2026 - Present · 1 mo · Bengaluru · Hybrid

SVE · Advanced Vector Extensions (AVX) · SIMD · OpenMP · Large Language Models (LLM) · Thread affinity · +1 more

Fujitsu Research

Software Engineer

Jul 2024 - Mar 2026 · 1 yr 8 mos · Bengaluru · Hybrid

  • Implemented SVE-based vectorized dot product kernels for quantized formats (Q2_K, Q3_K, Q4_0, Q8_0) in llama.cpp, achieving up to 3× throughput improvement on 512-bit architectures.
  • Implemented a lookahead speculative decoding framework in llama.cpp, enabling faster inference.
  • Designed NUMA-aware execution and optimized thread affinity to improve CPU utilization and reduce cross-node memory latency.
SVE · Advanced Vector Extensions (AVX) · Large Language Models (LLM)

Education

Indian Institute of Science (IISc)

Master of Technology - MTech — Computer Science

Jul 2022 - May 2024

Government College of Engineering, Amravati

B.Tech — CSE

Jan 2018 - Jan 2022