Saurabh Yadav

Backend Engineer

Bengaluru, Karnataka, India · 2 yrs 6 mos experience

Most Likely To Switch · AI ML Practitioner

Key Highlights

Expert in fine-tuning LLMs for improved accuracy.
Proficient in building low-latency backend services.
Strong background in prompt engineering and inference validation.

Stackforce AI infers this person is a Backend Engineer specializing in LLMs and SaaS solutions.

Skills

Core Skills

Large Language Models (LLM) · Model Fine Tuning

Other Skills

Amazon Web Services (AWS) · Analytical Skills · Automated Email · Backend Logic Development · C++ · CRUD Operations · Cascading Style Sheets (CSS) · Communication · DBeaver · DSA · Data Structures and Algorithms (DSA) · Database Management System (DBMS) · Django REST Framework · Docker · Engineering

About

Backend engineer focused on production LLM inference systems. Experience fine-tuning and deploying instruction models (LoRA, Hugging Face) and building low-latency backend services using Go and Python. Skilled in prompt engineering, inference validation, and system integrations that reduce latency and improve model accuracy.

Experience

Total Experience: 2 yrs 6 mos

Average Tenure: 1 yr 3 mos

Current Experience: 2 yrs 1 mo

IBM

Backend Developer (Watsonx.ai and Watson Machine Learning (WML))

May 2024 – Present · 2 yrs 1 mo · Bengaluru, Karnataka, India · On-site

➔ Designed and developed the MCP Deployment Server, a core backend component enabling model deployment, management, and inference across IBM Cloud using the Model Context Protocol (MCP).
➔ Fine-tuned the IBM/granite-3.3B Instruct model by generating synthetic data from Jane’s API, improving model response quality on targeted tasks.
➔ Built a custom LoRA adapter (a parameter-efficient fine-tuning method) using gbcli to adapt the model to specific tasks more quickly.
➔ Applied prompt engineering and training-data enhancements to improve the accuracy of model responses.
➔ Iteratively improved the model’s accuracy to 98% by refining the training data and adjusting the fine-tuning process.
➔ Performed inference testing and validation, ensuring model robustness and alignment with business requirements.
➔ Implemented Envoy Proxy to maintain continuous synchronization with Redis for updated metadata, enabling efficient routing of inference requests to the runtime manager for predictions.
➔ Implemented the GoDog FVT Framework, streamlining the deployment, testing, and prediction processes for Watsonx and WML models.
➔ Migrated keys from etcd to Redis, reducing record retrieval time by 100 ms and improving storage performance by leveraging Redis’ support for multiple data types.

MCP Server · IBM Watson · Model Fine Tuning · Prompt Engineering · Large Language Models (LLM) · Go (Programming Language) · +3

Flash.co

Backend Intern

Mar 2024 – Apr 2024 · 1 mo · Bengaluru, Karnataka, India · On-site

➔ Developed and integrated the complete backend logic for promo codes tailored for new users of the Flash platform.
➔ Designed and implemented robust error handling mechanisms to ensure seamless functionality of promo code usage.
➔ Engineered backend solutions for feature flag management, enhancing the flexibility and control of Campaigns and Rewards services.
➔ Collaborated closely with cross-functional teams to ensure smooth deployment and integration of new features.
➔ Conducted thorough testing and debugging to maintain high standards of code quality and reliability.

Backend Logic Development · Error Handling · Feature Flag Management

Effigo

Product Engineering Intern

Jan 2024 – Mar 2024 · 2 mos · Bengaluru, Karnataka, India · On-site

Developed a comprehensive learning portal application featuring three distinct user roles:
➔ Admin role empowered with CRUD operations for course management and editing functionalities.
➔ Author role enabling publishing of new courses to the platform.
➔ User role facilitating enrollment in available courses for a seamless learning experience.
Implemented Vendor Summary logic and page within a Supply-Chain application, showcasing detailed order information alongside Initiator names and various levels of approval details.

CRUD Operations · User Role Management

Deeptek

SDE Intern

Jul 2023 – Dec 2023 · 5 mos · Pune, Maharashtra, India · On-site

Implemented QR Scanner for Patient Details
➔ Developed a QR code scanning feature to efficiently capture patient information.
➔ Utilized the Zebra Crossing (ZXing) library, a popular open-source library for QR code processing, to enable this functionality.
➔ Successfully integrated the QR scanner into the application to streamline the process of saving patient details.
Implemented Cron Job for Automated Email
➔ Set up a cron job to automate the process of sending emails to Deeptek firms.
➔ Configured the cron job to run at specified time intervals to ensure timely communication.
➔ Attached an Excel sheet containing patient details to the automated emails, ensuring that the recipient has all the necessary information.
Worked on Multipart/Related Data Generation
➔ Developed the ability to generate and handle multipart/related data.
➔ Implemented APIs to manage requests involving multipart/related data in the application.
➔ Utilized the resteasy-multipart-provider dependency and the resteasy-spring-boot-starter dependency from the RESTEasy library to facilitate the handling of multipart/related data.