Tapas Das

Machine Learning Engineer

Bengaluru, Karnataka, India · 4 yrs experience

Most Likely To Switch · Highly Stable

Key Highlights

Expert in developing scalable AI solutions.
Reduced inference latency by 30% in production systems.
Proficient in advanced machine learning techniques.

Stackforce AI infers this person is a Backend-heavy AI/ML Engineer specializing in scalable solutions for enterprise applications.

Skills

Core Skills

Machine Learning · Retrieval-Augmented Generation (RAG)

Other Skills

Azure OpenAI · LangChain · PostgreSQL · Deep Learning · Natural Language Processing (NLP) · Random Forest Regressor · XGBoost · Engineering Data Management · Analytical Skills · SQLite · Amazon EC2 · Pre-production Planning · Increase Productivity · Docker Products · Computer Modeling

About

As an AI/ML Engineer at Intel, I design and deploy end-to-end GenAI solutions using Retrieval-Augmented Generation (RAG), LangChain, and open-source LLMs. My work involves developing production-grade systems that combine robust backend engineering with machine learning, from model optimization to real-time inference. I bring deep expertise in Python, PyTorch, TensorFlow, and FastAPI, along with hands-on experience integrating LLMs into real-world applications.

At Intel, I led the development of a FastAPI-based embedding API built on open-source models and vector databases, reducing inference latency by 30% and enabling GPU/CPU switching on the fly.

What drives me is solving real-world problems through scalable AI. Whether I am contributing to open source or building internal solutions that save engineering time, I thrive at the intersection of system design and intelligent automation. I'm always exploring new AI frontiers and looking to collaborate with others building the future.

Experience

4 yrs

Total Experience

3 yrs 11 mos

Average Tenure

4 yrs

Current Experience

Intel Corporation

Machine Learning Engineer

Jul 2022 – Present · 3 yrs 11 mos · India · Hybrid

Developed and deployed a production-grade Retrieval-Augmented Generation (RAG) system using Azure OpenAI, LangChain, and PostgreSQL, increasing support assistant efficiency by 30%.
Fine-tuned Hugging Face's BGE Reranker model on proprietary semiconductor data to resolve domain-specific acronyms and boost answer precision without relying on system prompts.
Engineered a supervised ML pipeline to analyze workload characteristics and predict optimal CPU allocation, reducing average processing time by 30%.
Built ML-based CPU modeling pipelines using Random Forest Regressor (RFR) and XGBoost, achieving a 7% Mean Squared Error (MSE) in design-time performance predictions.
Designed and implemented custom document loaders and a markdown-aware chunking strategy, significantly improving LLM response quality in structured technical documentation.
Integrated a hierarchical multi-agent RAG architecture combining SQL-based agents and vector-based agents under a supervisor agent, tailored for semiconductor design queries.
Implemented advanced GenAI tooling including LangChain Indexing, RAGAS framework, and LOTR (Lord of the Retrievers) for retrieval benchmarking and pipeline optimization.
Applied deep learning techniques across ANN, CNN, RNN, LSTM, and Transformer models for various modeling and NLP-related tasks.
Collaborated cross-functionally with platform, infra, and data science teams to ensure ML pipelines met SLA requirements and were production-ready, secure, and scalable.

Retrieval-Augmented Generation (RAG) · Azure OpenAI · LangChain · PostgreSQL · Machine Learning · Deep Learning · +1
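
For readers unfamiliar with the markdown-aware chunking mentioned in this role, the following is a minimal illustrative sketch, not the implementation used at Intel. It assumes that "markdown-aware" means splitting at ATX headings while leaving fenced code blocks intact. The function name `chunk_markdown` is hypothetical.

```python
import re

FENCE = "`" * 3  # fenced-code delimiter, built this way to keep the example self-contained


def chunk_markdown(text):
    """Split markdown into one chunk per heading section (illustrative only).

    Lines that look like headings inside a fenced code block are ignored,
    so code samples are never cut in half.
    """
    sections, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code fence
        is_heading = (not in_fence) and re.match(r"#{1,6}\s", line) is not None
        if is_heading and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    return [s for s in sections if s]  # drop empty sections
```

Each returned chunk starts with its own heading, so a retriever can surface the section title together with the body text.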

Intel Technology India Pvt. Ltd.

Artificial Intelligence Engineer

Jun 2022 – Present · 4 yrs

Developed a scalable RAG pipeline using Azure OpenAI, LangChain, and PostgreSQL that improved internal chatbot accuracy by 20%, giving engineering teams faster data access.
Built a modular SQL agent that translates natural language queries into structured SQL over CSV, JSON, and XLSX files by generating temporary in-memory databases, enabling analytics without manual data prep and increasing accessibility for non-technical users.
Designed a hybrid AI assistant combining semantic retrieval with SQL-based querying via a two-stage query decomposition chain.
Integrated a Supervisor Agent for dynamic intent routing, reducing query resolution time and boosting precision in chip design workflows.
Implemented a real-time ML pipeline using XGBoost to predict optimal CPU configurations, reducing inference latency by 20% and achieving 7% MSE to streamline compute planning in early-stage hardware design.

Azure OpenAI · LangChain · PostgreSQL · XGBoost · Natural Language Processing (NLP) · Machine Learning · +1
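
The "temporary in-memory database" idea behind the SQL agent can be sketched roughly as follows. This is an illustrative example using only the Python standard library, not the agent described above. It covers the CSV case only and leaves out the natural-language-to-SQL step. The function name `load_csv_into_memory`, the table name `data`, and the sample columns are all hypothetical.

```python
import csv
import io
import sqlite3


def _quote(identifier):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def load_csv_into_memory(csv_text, table="data"):
    """Load CSV text into a throwaway in-memory SQLite table (illustrative only).

    Columns are declared NUMERIC so that SQLite stores number-like text as
    numbers; other values stay as text.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row is assumed to hold column names
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"{_quote(name)} NUMERIC" for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {_quote(table)} ({columns})")
    conn.executemany(f"INSERT INTO {_quote(table)} VALUES ({placeholders})", reader)
    conn.commit()
    return conn


# Usage: generated SQL can then be run against the temporary table.
conn = load_csv_into_memory("block,cores\nalu,4\nfpu,8\n")
rows = conn.execute("SELECT block FROM data WHERE cores > 4").fetchall()
```

The database exists only for the lifetime of the connection, so no files need cleaning up after a query session.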