Dhruv Sharma

AI Researcher

Delhi, India · 2 yrs 6 mos experience

Highly Stable · AI Enabled

Key Highlights

Proven track record in AI and machine learning applications.
Significant improvements in translation model accuracy.
Expertise in developing advanced NLP solutions.

Stackforce AI infers this person is a skilled AI/ML developer with a focus on NLP and computer vision.

Skills

Core Skills

Natural Language Processing (NLP) · Machine Learning · Computer Vision

Other Skills

Adobe Photoshop · Algorithms · Artificial Intelligence (AI) · BERT (Language Model) · Bash · Canva · Chatbot Development · Chatbots · Convolutional Neural Networks (CNN) · Data Analytics · Data Caching · Databases · Debugging · Deep Learning · Electronic Engineering

About

Hi, my name is Dhruv Sharma. I am a B.Tech graduate of Maharaja Agrasen Institute of Technology (MAIT), where I completed my degree in Electronics and Communication Engineering with a minor specialization in Artificial Intelligence and Machine Learning (AIML) in June 2025, with a CGPA of 8.38/10. I have applied AI and machine learning to real-world problems such as building an object detection system and fine-tuning language models. I aim to leverage my skills in Python, deep learning frameworks, and problem-solving to drive technological advancements with creativity and precision.

Experience

2 yrs 6 mos

Total Experience

2 yrs 2 mos

Average Tenure

Current Experience

National Informatics Centre, MeitY

Artificial Intelligence Intern

Feb 2025 – Jun 2025 · 4 mos · New Delhi, Delhi, India · On-site

As an Artificial Intelligence Intern at the National Informatics Centre (NIC), I worked to reduce round-trip translation errors in AI4Bharat's IndicTrans2, a transformer-based translation model supporting 22 Indian languages that was trained on the Bharat Parallel Corpus Collection dataset.
A round-trip translation error occurs when a word from one language (e.g., word A in language 1) is translated into another language (e.g., word B in language 2), but translating word B back to the original language does not return word A, indicating a loss of accuracy or semantic consistency in the translation process.
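The round-trip check described above can be sketched in a few lines. This is a toy illustration, not the evaluation code used at NIC: the two lookup tables stand in for the En-Indic and Indic-En models, and the names `round_trip_errors`, `EN_TO_TA`, and `TA_TO_EN` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def round_trip_errors(
    words: Iterable[str],
    forward: Callable[[str], str],
    backward: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (source, translated, restored) for each word that does not survive the round trip."""
    errors = []
    for word in words:
        translated = forward(word)       # language 1 -> language 2
        restored = backward(translated)  # language 2 -> language 1
        if restored != word:
            errors.append((word, translated, restored))
    return errors


# Toy lookup tables standing in for real translation models (assumption).
EN_TO_TA = {"water": "தண்ணீர்", "house": "வீடு", "home": "வீடு"}
TA_TO_EN = {"தண்ணீர்": "water", "வீடு": "house"}

errors = round_trip_errors(
    ["water", "house", "home"],
    forward=EN_TO_TA.__getitem__,
    backward=TA_TO_EN.__getitem__,
)
print(errors)  # "home" comes back as "house", so it is reported as an error
```

With real models, `forward` and `backward` would wrap the two translation directions, and exact string equality would likely be replaced by a softer similarity measure.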
My task was to enhance the model’s accuracy by addressing this issue. After initial explorations, I moved to the Fairseq toolkit and adapted the process for a single GPU (Tesla T4 on Google Colab). I fine-tuned both the Indic-En (Indian languages to English) and En-Indic (English to Indian languages) models on English-Tamil data entries.
Using Bash scripts for efficient data handling, I achieved the following improvements:
The Indic-En model (fine-tuned on 8,500 Tamil-English data entries) achieved a BLEU score of 26.1 (up from 3.2), chrF2 of 56.6 (up from 32.5), and chrF2++ of 53.4 (up from 27.6).
The En-Indic model (fine-tuned on 8,500 English-Tamil data entries) achieved a BLEU score of 22.1 (up from 3.0), chrF2 of 58.6 (up from 33.9), and chrF2++ of 52.9 (up from 28.7).
The final metrics show a substantial improvement in translation accuracy: BLEU rose by 22.9 points for the Indic-En model and by 19.1 points for the En-Indic model. The chrF2 and chrF2++ gains also indicate better character-level and word n-gram overlap with the reference translations, underscoring the effectiveness of my tailored fine-tuning approach.
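The gains quoted above follow directly from the before and after scores. The short sketch below recomputes them. The scores are copied from the figures reported in this section, and the names `SCORES` and `gains` are hypothetical.

```python
# (before, after) fine-tuning, copied from the scores reported above.
SCORES = {
    "Indic-En": {"BLEU": (3.2, 26.1), "chrF2": (32.5, 56.6), "chrF2++": (27.6, 53.4)},
    "En-Indic": {"BLEU": (3.0, 22.1), "chrF2": (33.9, 58.6), "chrF2++": (28.7, 52.9)},
}


def gains(scores):
    """Return the absolute improvement per model and metric, rounded to one decimal."""
    return {
        model: {
            metric: round(after - before, 1)
            for metric, (before, after) in metrics.items()
        }
        for model, metrics in scores.items()
    }


GAINS = gains(SCORES)
for model, metrics in GAINS.items():
    print(model, metrics)
```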

Python (Programming Language) · Bash · Model Training · Data Analytics · Natural Language Processing (NLP) · Fine Tuning · +2

Phrenics

3 roles

Organization Manager

Jun 2024 – May 2025 · 11 mos

Head of Photography

Aug 2023 – Jun 2024 · 10 mos

Canva

Member

Dec 2022 – May 2025 · 2 yrs 5 mos

Canva · Adobe Photoshop

Ministry of Electronics and Information Technology

Summer Internship

Jun 2024 – Aug 2024 · 2 mos · On-site

During my internship at the Digital India Bhashini Division under the Digital India Corporation, which operates under the Ministry of Electronics and Information Technology (MeitY), I focused on developing and implementing advanced natural language processing (NLP) techniques, particularly in building and optimizing AI-based solutions.
One of my key responsibilities was studying and developing n-gram language models to enhance text prediction and generation capabilities.
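As an illustration of the n-gram idea, here is a minimal bigram (n = 2) model for next-word prediction. It is a sketch under simple assumptions (whitespace tokenization, no smoothing), not the models built during the internship. The function names and the tiny corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev), without smoothing."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


corpus = [
    "the model translates text",
    "the model predicts text",
    "the chatbot answers questions",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))
print(probability(model, "the", "model"))
```

A production model would add smoothing (for example add-one or backoff) so that unseen word pairs do not get zero probability.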
Additionally, I designed and developed a chatbot using the Retrieval-Augmented Generation (RAG) technique, combining retrieval-based methods with generative models to produce accurate and contextually relevant responses. Initially, I employed RetrievalQA and Vectorstore for the retrieval component but later switched to Redis for caching to improve efficiency and performance.
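The retrieve, generate, and cache flow can be sketched with the standard library alone. Everything here is a stand-in chosen for illustration: a bag-of-words `embed` replaces the real embedding model, a plain dictionary replaces Redis, `generate` replaces the language model call, and caching whole answers keyed by the question is an assumption about what was cached.

```python
import hashlib
import math
from collections import Counter

# Hypothetical knowledge base for the sketch.
DOCS = [
    "Redis is an in-memory data store often used as a cache.",
    "Streamlit builds web interfaces for Python applications.",
    "An n-gram model predicts the next word from the previous words.",
]


def embed(text):
    """Bag-of-words counts; a stand-in for a real embedding model."""
    cleaned = text.lower().replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, docs=DOCS):
    """Return the document most similar to the question."""
    q = embed(question)
    return max(docs, key=lambda d: cosine(q, embed(d)))


cache = {}                 # plain dict standing in for Redis
calls = {"generate": 0}    # counts how often the (fake) model is invoked


def generate(question, context):
    """Placeholder for the language model call."""
    calls["generate"] += 1
    return f"Based on the context: {context}"


def answer(question):
    key = hashlib.sha256(question.strip().lower().encode("utf-8")).hexdigest()
    if key in cache:       # cache hit: skip retrieval and generation
        return cache[key]
    response = generate(question, retrieve(question))
    cache[key] = response
    return response


first = answer("What is Redis used for?")
second = answer("What is Redis used for?")  # served from the cache
print(first)
print(calls["generate"])
```

Keying the cache on a hash of the normalized question keeps keys short and uniform, which also suits a key-value store such as Redis.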
I used several pretrained models, such as 'BAAI/bge-base-en-v1.5' for tokenization and embedding and 'microsoft/Phi-3-mini-4k-instruct' for text generation, to optimize the chatbot's functionality.
Furthermore, I implemented a code snippet to ensure that the generated responses were concise and relevant by removing unnecessary context or additional questions.
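A post-processing step of this kind might look like the sketch below. The stop markers, the sentence limit, and the function name `clean_response` are assumptions made for illustration. The original snippet is not shown in this profile.

```python
import re

# Hypothetical markers after which a model tends to echo context or invent new questions.
STOP_MARKERS = ("Question:", "Context:", "Q:")


def clean_response(raw: str, max_sentences: int = 3) -> str:
    """Cut the text at the first stop marker and keep at most `max_sentences` sentences."""
    text = raw.strip()
    cut = len(text)
    for marker in STOP_MARKERS:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)  # keep only what precedes the earliest marker
    text = text[:cut].strip()
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(sentences[:max_sentences])


raw_output = (
    "Redis stores data in memory. It is often used as a cache.\n"
    "Question: What else can it do?\n"
    "Context: Redis is an in-memory data store."
)
print(clean_response(raw_output))
```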
I also developed a user-friendly web interface using Streamlit, enabling seamless user interaction with the chatbot.
Throughout this internship, I gained practical experience in developing and deploying AI-based solutions and deepened my understanding of NLP.

Redis · N-Gram Language Models · Problem Solving · NumPy · Chatbots · spaCy · +12

CDAC, Noida

Summer Internship

Jul 2023 – Sep 2023 · 2 mos · Hybrid

During my internship at the Centre for Development of Advanced Computing (CDAC) in the summer of 2023, I developed a real-time Python program for vehicle detection and distance estimation. Using supervised learning techniques and TensorFlow, I trained a model on datasets of Indian traffic and roads, which I collected and managed, ensuring a diverse representation of urban and rural scenarios.
My work involved segmenting and labeling image components with tools like LabelImg, applying Image Processing techniques such as Edge Detection and Thresholding with OpenCV to preprocess data, and analyzing performance metrics to iteratively improve model accuracy.
I calculated and displayed confidence levels for detected vehicles, enhancing the program's reliability, and integrated geometric calculations using triangle similarity to estimate distances based on focal length and vehicle dimensions.
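The triangle-similarity estimate can be written out as follows. This is a minimal sketch assuming a pinhole camera model. The function names and the sample numbers (a 2.0 m wide vehicle, a calibration shot at 10 m) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def focal_length_px(known_distance_m: float, real_width_m: float, width_px: float) -> float:
    """Calibrate the focal length (in pixels) from one reference image at a known distance."""
    return (width_px * known_distance_m) / real_width_m


def estimate_distance_m(focal_px: float, real_width_m: float, width_px: float) -> float:
    """Similar triangles: distance = real width * focal length / apparent width in pixels."""
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    return (real_width_m * focal_px) / width_px


# Calibration: a 2.0 m wide vehicle appears 200 px wide at 10 m (assumed values).
focal = focal_length_px(known_distance_m=10.0, real_width_m=2.0, width_px=200.0)

# Later frame: the same class of vehicle appears 100 px wide, so it is twice as far away.
distance = estimate_distance_m(focal, real_width_m=2.0, width_px=100.0)
print(focal, distance)
```

In a detector pipeline, `width_px` would come from the bounding-box width and `real_width_m` from a per-class typical width, so the estimate is only as good as those two inputs.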
I focused on detecting and measuring the distance of six specific vehicle classes:
Bike
Car
Ambulance
Lorry
Bus
Auto (Tuk-tuk)
For each class, I optimized the model by adjusting hyperparameters and expanding the dataset to include challenging angles and occlusions, ensuring comprehensive coverage.
The results were strong: the YOLOv5 model achieved a mean Average Precision (mAP) of 94.5%, precision of 90.9%, and recall of 86.6% across all classes, validated on a dataset of 298 images containing 333 instances.
Class-wise performance varied, with cars showing the highest mAP50 of 99.3% and ambulances the lowest mAP50-95 of 44.7%, reflecting the dataset’s diversity. These outcomes underscored the model’s effectiveness in real-time applications, providing accurate detection and distance estimation.