Firoz K.

AI Researcher

Gurugram, Haryana, India · 4 yrs experience

Most Likely To Switch · Highly Stable

Key Highlights

Established AI as a core competency for the company
Built deployment infrastructure for diffusion models from scratch
Created LLM pipelines for high-stakes financial applications

Stackforce AI infers this person is a Fintech and AI specialist with expertise in advanced model deployment and optimisation.

Skills

Core Skills

Large Language Models (LLM), Natural Language Processing (NLP), Large Language Model Operations (LLMOps), Information Retrieval, Diffusion Models, Artificial Intelligence (AI), Software Development, Computer Vision

Other Skills

AWS, Amazon S3, Amazon Web Services (AWS), Audio Editing, Autonomous Agents, Cascading Style Sheets (CSS), DCGAN, DSPy, Data Science, Data Scraping, Databases, Deep Learning, Diffusers Library, Document Retrieval, Dreambooth

About

I led the development and deployment of AI capabilities across the organisation for 3+ years, overseeing projects ranging from diffusion models for image generation to LLM systems for personalisation and financial analysis. My biggest impact was establishing AI as a core competency for the company. When diffusion models first emerged, I built the entire deployment infrastructure from scratch. For our financial products, I created deterministic LLM pipelines that maintained accuracy in high-stakes applications. I was among the first to deploy MCP tools integrated with DSPy in production, creating an interaction wrapper that solved the scalability challenge before it became a standard pattern in the ecosystem. Beyond the technical work, I led company-wide knowledge sharing and empowered other teams, including frontend developers, to independently build AI features. I drove all quality improvements, experimented with emerging technologies, and ensured our products stayed at the frontier of what was possible. Essentially, I was responsible for both building the AI systems and building the organisational capability around AI.

Experience

Total Experience: 4 yrs

Average Tenure: 4 yrs

Current Experience: 4 yrs

Totality corp

5 roles

Senior AI Engineer

Promoted

Jan 2025 – Present · 1 yr 4 mos

1. Designed and deployed a self-improving multi-agent dialogue system (DSPy, gRPC, internet search) with:
🤖 Real-time human–AI group conversations among emotional, professional, and domain-specific agents.
🧠 Continuous learning through intent and sentiment analysis to dynamically update agent personas.
🌐 Internet and vector-store integration for up-to-date, personalised, and context-aware responses.
⚙️ A global–local prompt optimisation pipeline that adapts each user’s persona while improving overall system intelligence.
2. Fine-tuned LLMs and encoder-only models (BERT, ModernBERT) for classification tasks, achieving F1 scores of up to 92% on benchmark datasets.
3. Built automated prompt optimisation pipelines, improving model performance by 10–12% and reducing inference costs by 85%.
4. Optimised and deployed LLMs for Android (GGUF) and iOS (q-safetensors), cutting memory footprint by 50% while maintaining comparable accuracy.
5. Used Unsloth + TRL with GRPO-style training (DeepSeek) on Gemma 3, achieving an 18% accuracy gain and producing edge-compatible fine-tuned variants.

Large Language Models (LLM), Large Language Model Operations (LLMOps), Natural Language Processing (NLP), Deep Learning, Model Fine-Tuning, Sentiment Analysis, +9

AI Engineer

Oct 2023 – Dec 2024 · 1 yr 2 mos

1. Built a real-time market insights agent (DSPy, LLMs, web scraping) that processed 100% of NSE announcements within 2–5 mins, compared with 1–9 hours on traditional systems, while improving precision to 98%.
2. Implemented STORM, the world’s first deep-research financial RAG system, reducing report creation time by 90% and increasing accuracy by 70%.
3. Developed ColPali/ColQwen Image-RAG pipelines, enabling PDF-based financial Q&A without OCR — achieving 90% higher accuracy in edge cases.
4. Engineered Neo4j Knowledge Graphs (100k+ nodes) integrated with DSPy for cross-domain data connectivity.
5. Created Text2Cypher NL Query System (Neo4j, DSPy, Meilisearch), achieving 95% query accuracy in production.
6. Designed hybrid retrieval pipelines (Cohere, HyDE, BM25, DSPy) across FAISS, Pinecone, and Weaviate for state-of-the-art retrieval and ranking.
7. Integrated MCP + ReAct for advanced agentic reasoning in finance-focused AI agents.
8. Built autonomous web agents for verified information aggregation with authenticity scoring.
9. Developed an AI Podcast System using Whisper, ElevenLabs, and Pinecone for retrieval-based dynamic content generation.

Audio Editing, Large Language Models (LLM), Large Language Model Operations (LLMOps), Information Retrieval, Document Retrieval, Vector Stores, +6

Generative AI Engineer

Promoted

Jan 2023 – Oct 2023 · 9 mos

1. Achieved 95% personalised image accuracy with 70% less training data using Dreambooth and custom face cropping pipelines.
2. Deployed multi-platform diffusion pipelines (AWS, GCP, Lambda Labs, Replicate), ensuring 99.9% uptime.
3. Integrated IP Adapter for efficient similarity-based inference, cutting processing time by 85% and costs by 70%.
4. Implemented inswapper_128.onnx-based face swapping with a 98% realism success rate.
5. Improved image inference pipeline efficiency by 60% using HuggingFace + Kubernetes architecture.

Image Processing, Artificial Intelligence (AI), Deep Learning, Diffusion Models, Kubernetes, Diffusers Library

Software Developer

Mar 2022 – Dec 2022 · 9 mos

1. Designed KYC and authentication services (gRPC, OAuth, JWT, S3), handling discrepancies between multiple identity sources such as Aadhaar and PAN, and later integrated DigiLocker for enhanced security and reliability.
2. Built real-time analytics pipelines (Kinesis, Lambda, Athena, Glue) for large-scale game data.
3. Developed a secure 2FA system, integrated gRPC and REST services, and implemented a referral system to enhance user engagement and security.
4. Developed analytics dashboards with QuickSight to visualise user behaviour, improving analysis speed by 60%.
5. Built backend API documentation to improve developer onboarding and maintainability.

gRPC, Object-Relational Mapping (ORM), Go (Programming Language), REST APIs, MySQL, SQL, +3

Computer Vision Intern (R&D)

Oct 2021 – Mar 2022 · 5 mos

1. Built and fine-tuned DCGAN models to enhance profile picture (PFP) generation quality.
2. Conducted early LLM research (GPT-3, GPT-J, GPT-Neo) for chatbots and dynamic NFT generation use cases.
3. Trained ML Agents in Unity using reinforcement learning and explored potential applications as enemies in third-person shooter games.
4. Experimented with NeRF technology to generate 3D voxel models from image datasets.

Image Processing, Python (Programming Language), Machine Learning, OpenCV, Computer Vision