Hemanth Rudra — Software Engineer
Compute AI Engineer specializing in on-device LLM enablement, inference optimization, and agentic AI systems on Qualcomm Snapdragon platforms. My work focuses on bringing large language models and multimodal models to edge devices, using the Hexagon NPU (QNN/HTP) for high-performance inference. I build end-to-end inference pipelines, optimize execution across runtimes, and design agentic workflows using local LLMs.
Key areas I work on:
- Enabling LLMs on the Qualcomm NPU
- Building efficient inference pipelines (QNN, ONNX Runtime, local runtimes)
- Designing agent-based systems with on-device LLMs
- Deep performance profiling and benchmarking (latency, tokens per second (TPS), memory)
I’m particularly interested in edge AI, agentic workflows, and performance optimization, pushing LLM capabilities closer to real-world deployment on devices.
Stackforce AI infers this person is a specialized AI Engineer focused on edge AI and performance optimization.
Location: Hyderabad, Telangana, India
Experience: 4 yrs 11 mos
Skills
- Large Language Models (LLMs)
- Inference Optimization
- Profiling
- Benchmarking
Career Highlights
- Expert in LLM enablement on edge devices.
- Proficient in performance optimization and benchmarking.
- Strong background in AI systems on Qualcomm platforms.
Work Experience
Qualcomm
Senior Engineer (1 yr 7 mos)
Engineer (2 yrs 7 mos)
Nokia
Graduate Engineering Trainee (9 mos)
Research And Development Intern (9 mos)
Education
Integrated M.Tech at Vellore Institute of Technology