Vinayak Koli

Software Engineer

Delhi, Delhi, India · 5 mos experience

Key Highlights

  • Developed advanced LLM benchmarking pipelines.
  • Built production-grade applications across multiple tech stacks.
  • Passionate about merging AI research with practical engineering.
Stackforce AI infers this person is a Full-Stack Developer with a strong focus on AI and LLM technologies.

Skills

Core Skills

Python · LLM Evaluation · REST APIs · Laravel

Other Skills

Ollama · HumanEval · Matplotlib · Agentic AI · PHP · MySQL · RESTful APIs · Eloquent ORM · Recommendation Systems · Algorithm Design · TypeScript · HuggingFace Transformers · LangChain · FastAPI · React Native

About

I build AI systems and full-stack applications that solve real problems, from hybrid movie recommendation engines trained on 1M+ ratings to desktop AI assistants that run entirely offline.

Currently, I'm an undergraduate researcher at IIIT Delhi under Prof. Suman Roy, benchmarking Small Language Models on code generation tasks (165+ problems across HumanEval and DeepMind Code Contests), analyzing multi-step reasoning failures, and contributing to agentic AI workflows.

On the engineering side, I've shipped production-grade systems across the stack: FastAPI + React Native mobile apps, Laravel REST APIs with role-based auth, Electron desktop apps with OS-level hooks, and ML pipelines using PyTorch, LightGBM, and LangChain.

What drives me: the intersection of research and engineering, building things that are both theoretically grounded and practically useful.

🔧 Stack: Python · PyTorch · LangChain · React Native · FastAPI · TypeScript · Laravel · HuggingFace
📍 Delhi | IIIT Delhi '27 | Open to SWE & AI/ML internships (Summer/Fall 2026)
📬 vinayak23597@iiitd.ac.in | https://github.com/vinayakkoli2005 | https://vinaytasensei.vercel.app/

Experience

Total Experience: 5 mos

Indraprastha Institute of Information Technology, Delhi

Undergraduate Student Researcher

Jan 2026 – Present · 5 mos

  • Designed and implemented an ontology-guided LLM code generation pipeline with plan evaluation, iterative repair, and adaptive knowledge graph updates, benchmarked against OpenAI HumanEval (164 problems).
  • Built a 7-step pipeline from scratch: ontology retrieval via a custom 9-category knowledge graph (sorting, recursion, arithmetic, etc.), structured plan generation, self-scoring gate (revision triggered if plan score < 0.6), ontology-hint-injected plan revision, sandboxed code execution, error-feedback repair loop (up to 3 iterations with stagnation detection), and graph edge-weight updates on task success.
  • Benchmarked 10 local LLMs via Ollama (CodeLlama 7B/13B, Qwen2.5 7B–32B, Qwen2.5-Coder 7B–32B) across 3 prompting strategies (Direct, SCoT, and Ontology-Guided), achieving 76.22% pass@1 with the full pipeline vs. a 64.63% baseline (+11.6pp).
  • Key findings: plan–success correlation r=0.168 (models systematically overestimate plan quality); counter-intuitive accuracy collapse in Qwen2.5-Coder at 14B/32B scale (suspected alignment regression).
  • Infrastructure: CLI pipeline with full reproducibility (seed, manifest snapshots), per-task JSONL logging, automated markdown/CSV reporting, and ontology graph visualization via NetworkX + Matplotlib.
  • Stack: Python · Ollama · HumanEval · Matplotlib · LLM Evaluation · Agentic AI
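
The plan gate and repair loop described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the 0.6 threshold and the 3-iteration cap come from the description, while the function names, the `run_tests` and `repair` callables, and the definition of stagnation (the same error message twice in a row) are assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

PLAN_SCORE_THRESHOLD = 0.6   # from the description: revise if plan score < 0.6
MAX_REPAIR_ITERATIONS = 3    # from the description: up to 3 repair iterations


def needs_revision(plan_score: float, threshold: float = PLAN_SCORE_THRESHOLD) -> bool:
    """Self-scoring gate: a plan scoring below the threshold goes back for revision."""
    return plan_score < threshold


def repair_loop(
    code: str,
    run_tests: Callable[[str], Optional[str]],  # hypothetical: None on success, else error text
    repair: Callable[[str, str], str],          # hypothetical: (code, error) -> new candidate
    max_iterations: int = MAX_REPAIR_ITERATIONS,
) -> Tuple[str, bool, int]:
    """Run the code, feed errors back into a repair step, stop early on stagnation.

    Returns (final_code, passed, repairs_used).
    """
    error = run_tests(code)
    previous_error = None
    repairs_used = 0
    while error is not None and repairs_used < max_iterations:
        # Assumed stagnation rule: the last repair did not change the error.
        if error == previous_error:
            break
        previous_error = error
        code = repair(code, error)
        repairs_used += 1
        error = run_tests(code)
    return code, error is None, repairs_used
```

In a real pipeline `run_tests` would wrap the sandboxed executor and `repair` would call the model with the error message; both are left as plain callables here so the control flow can be read and tested on its own.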

1Stop

Backend Engineering Intern

Nov 2024 – Jan 2025 · 2 mos

  • Backend Engineering Intern at 1Stop (World Risk Governance).
  • Built two production REST APIs in Laravel 11: a Task Manager with full CRUD and completion workflows, and a Booking Management System with role-based dashboards (Admin/User), Laravel Breeze auth, and CMS-driven pages.
  • Designed and migrated relational MySQL schemas; implemented server-side deduplication and Eloquent ORM model relationships.
  • Delivered both systems in a 2-month remote engagement.
  • Tech: Laravel 11 · PHP · MySQL · RESTful APIs · Eloquent ORM · Blade Templates

Education

Indraprastha Institute of Information Technology, Delhi

Bachelor of Technology (BTech), Computer Science and Social Sciences

Jan 2023 – Jul 2027

Kulachi Hansraj Model School

Jun 2008 – Jun 2023

Indraprastha Institute of Information Technology, Delhi

Computer Science and Social Sciences (CSSS)

Kulachi Hansraj Model School

Xth & XIIth
