Abhay Kumar — Co-Founder
I'm a Senior Research Engineer with 8+ years in data science and over 5 years focused on language modeling. I specialize in building and training Large Language Models (LLMs), with a deep interest in training stability, optimization, and scaling. Most recently, I co-led the pretraining of a 7B-parameter LLM on 15 trillion tokens. At BluOrion, I also developed ZClip, an adaptive gradient clipping algorithm that improves convergence stability and eliminates the need for manual batch skipping. Additionally, I’ve worked on initialization strategies and activation variance control techniques to stabilize and accelerate LLM training. My experience spans training models at various scales, distributed training frameworks (FSDP, DeepSpeed, PyTorch Lightning), and open-source contributions including miniLLaMA, GPT2-TensorFlow, and miniGPTF. I'm passionate about solving real-world challenges in large-scale model training and pushing the boundaries of efficient LLM development.
Stackforce AI infers this person is a Senior Research Engineer specializing in AI/ML with a focus on Large Language Models.
Location: Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates
Experience: 12 yrs 5 mos
Skills
- Large Language Models (LLM)
- Distributed Training
- Deep Learning
- Transformers
Career Highlights
- Co-led pretraining of a 7B-parameter LLM.
- Developed ZClip for improved LLM training stability.
- Passionate about efficient LLM development.
Work Experience
Technology Innovation Institute
Senior LLM Research Engineer (9 mos)
BluOrion Limited
Senior LLM Research Engineer (10 mos)
yellow.ai
Research Scientist (NLP) 3 (2 yrs 8 mos)
Research Scientist (NLP) 2 (1 yr)
EdGE Networks Pvt. Ltd.
Senior Data Scientist (NLP) (2 yrs 5 mos)
Data Scientist (NLP) (9 mos)
Scry Analytics
Data Scientist (1 yr 4 mos)
Gauge Data Solutions
Data Analyst (1 yr)
Anyaani Technology Pvt. Ltd.
Co-Founder (1 yr 8 mos)
Education
Bachelor of Technology (B.Tech.) at Rajiv Gandhi Prodyogiki Vishwavidyalaya