Priyanshu Tuli

AI Researcher

Bengaluru, Karnataka, India · 5 yrs 7 mos experience

Most Likely To Switch · AI Enabled

Key Highlights

Expert in machine learning and predictive analytics.
Proven track record in optimizing supply chain solutions.
Strong leadership in developing innovative data-driven projects.

Stackforce AI infers this person is a Data Scientist specializing in Machine Learning and Supply Chain Analytics.

Contact

priyanshu1tuli@gmail.com LinkedIn

Skills

Core Skills

Machine Learning, Data Science

Other Skills

API Development, Airflow, Algorithms, Amazon ECS, Amazon EKS, Amazon Web Services (AWS), Analytical Skills, Apache, Apache Kafka, Arduino, Artificial Intelligence (AI), Artificial Neural Networks, Async, Asynchronous Microservice, Big Data Analytics

About

As a Senior Data Scientist at FourKites, I apply advanced machine learning techniques to tackle challenges in supply chain visibility, real-time shipment tracking, and predictive analytics. My journey into AI began with Andrew Ng’s Stanford course, which fueled my passion for the field. I’ve since deepened my expertise through real-world projects and by solving complex problems. Previously at Sigmoid, I developed a versatile LLM testing framework, built pipelines for text extraction and named entity recognition, optimized product placement workflows for CPG manufacturers, and led a team to create a Human Activity Recognition solution. These projects reduced testing and analysis times while improving product visibility and model accuracy. Before that, I worked at BlueOptima, where I applied Graph Neural Networks to source code analysis and developed hybrid models to measure developer efficiency. I hold a BE in Computer Engineering from Thapar Institute of Engineering & Technology, where I earned a scholarship for academic excellence. I’m passionate about continuous learning and collaborating to create impactful, data-driven solutions at FourKites.

Experience

5 yrs 7 mos

Total Experience

1 yr 4 mos

Average Tenure

1 yr 11 mos

Current Experience

FourKites, Inc.

Senior Data Scientist

Jul 2024 – Present · 1 yr 11 mos · Chennai, Tamil Nadu, India · Hybrid

Developed and productionized an LSTM-based architecture to predict single and multi-stop truck ETAs, improving accuracy by 10–15% through condensed historical check-call representations and focal loss weighting of ETA buckets. Designed backward-compatible integration with existing classical ML models and implemented inter-service communication with the location service using concurrent API calls and semaphores. Optimized latency by caching historical check-calls and minimizing API requests. Enhanced late delivery notification precision by 5–10% by integrating appointment time bucket weightage into the loss function using soft labels and KL Divergence loss.
Built and deployed 50+ customer-specific models using time-bucketized classification with tailored class weights and Kalman filtering, achieving 10–30% accuracy uplift across key delivery metrics versus traditional regression approaches. Introduced an isotonic calibration method to determine the optimal late delivery notification probability threshold based on journey progress.
Optimized ETA prediction models by integrating real-time target-encoded features and dynamic lagged features through lookup tables and Redis-based caching in the inference pipeline, improving ETA accuracy for long-tailed, right-skewed distributions.
Developed a predictive model to estimate vehicle dwell time at facilities during loading/unloading for Truck Loads with multiple delivery stops. Achieved a 40% and 35% improvement in accuracy for 30-minute and 60-minute predictions, respectively, based on online metrics. Successfully productionized the solution on an EKS cluster, enhancing downstream TL Multi-Stop ETA accuracy and providing more robust Late Reason Codes. Led this project independently from development to deployment.
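The soft-label and KL divergence idea mentioned above can be illustrated with a minimal sketch. This is not the production loss: the Gaussian smoothing around the true bucket, the `sigma` parameter, and the function names `soft_labels` and `kl_divergence` are assumptions made purely for illustration.

```python
import math


def soft_labels(true_bucket, n_buckets, sigma=1.0):
    """Spread probability mass around the true bucket (assumed Gaussian kernel)."""
    weights = [
        math.exp(-0.5 * ((i - true_bucket) / sigma) ** 2) for i in range(n_buckets)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions over the same buckets."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


# Hypothetical example: five time buckets, the true appointment bucket is index 2.
target = soft_labels(true_bucket=2, n_buckets=5)

# A prediction centred near the true bucket versus one centred far from it.
near_miss = [0.05, 0.30, 0.40, 0.20, 0.05]
far_miss = [0.60, 0.20, 0.10, 0.05, 0.05]

loss_near = kl_divergence(target, near_miss)
loss_far = kl_divergence(target, far_miss)
print(f"near miss loss: {loss_near:.4f}, far miss loss: {loss_far:.4f}")
```

With soft targets, a prediction that lands in a neighbouring bucket is penalised less than one that lands several buckets away, which a hard one-hot target with plain cross-entropy does not distinguish.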

GitHub Actions, Product Analytics, Apache Kafka, Decision Trees, Amazon Web Services (AWS), Python (Programming Language), +32

Sigmoid

Senior Data Scientist

Oct 2023 – Jul 2024 · 9 mos · Bengaluru, Karnataka, India · Hybrid

Developed a versatile LLM testing framework capable of evaluating custom RAG pipelines with a variety of LLMs and embedding models, incorporating multiple LLM judges and benchmarking metrics. The framework also generates synthetic question-answer pairs to provide ground truth data for pipeline testing. Notably, the framework accommodates GPT, Llama, and Mistral LLMs, resulting in a 30% reduction in testing time.
Developed an automated pipeline for text extraction and named entity recognition from financial documents, reducing analysis time and manual effort by 15% in a short period.
Devised a workflow and pipeline to verify adherence to product placement standards in retail stores and to assess essential metrics for optimizing shelf arrangement and layout for consumer packaged goods (CPG) manufacturers. This involved fine-tuning the YOLOv8 model to detect objects on shelves and shelf price tags, implementing brand detection via Google Gemini Vision Pro LLM on cropped images, and using the OpenAI CLIP model to obtain image captioning probabilities. This resulted in a 25% increase in product visibility and a 10% rise in sales.
Solved a client case study on product delivery date prediction in which the data contained privileged and masked values that cannot be used at inference time. The results were 90% on time, a significant increase compared to the client's existing delivery estimation system.
Led a team of 2 data scientists to build an end-to-end solution for a client for Human Activity Recognition from sensor data signals, quantifying the different body movements in each activity. We developed a custom hybrid approach of DNN, LSTM, and CNN models and handcrafted the features via signal isolation techniques, achieving a best accuracy of 95% on the test dataset.

Transfer Learning, Consumer Packaged Goods (CPG), Large Language Models (LLM), Decision Trees, Mistral, Pipelines, +20

BlueOptima

3 roles

Associate Machine Learning Data Engineer

Promoted

Oct 2022 – Oct 2023 · 1 yr

Built heterogeneous and homogeneous graph neural network models using different graph layers such as GATConv, HEATConv, SAGEConv, GCNConv, TransformerConv, and RGATConv.
Worked on building an end-to-end solution to identify vulnerabilities by colouring nodes using node classification. This handled different ways to mitigate vulnerabilities like using object and argument sanitisers; using escape characters for input validation; identifying sources from third party modules etc.
Built Graph NNs to classify vulnerable source code files using hybrid graphs (AST, DFG, CFG) and language-agnostic trails built from source code. Fine-tuned Word2Vec and GraphCodeBERT embeddings and the in-house large graph models for node features and vectorization, along with data augmentation on graphs. The models achieved an F1 score of 0.8 with natural data points.
Worked on intense feature engineering in building trails for GNN model training which stitch the relevant bits of code together from across a particular repo to provide relevant context for vulnerable or fixed pieces of code i.e. connecting both sink and source.
Worked on generating graph highlights for vulnerable pieces of code using Captum and Perturbations for model explainability and inference.
Improved the training time of GNNs using GPUs and brought it down by 3x.
Released the Beta Version of the Pure Coding Time Estimation Project (PCTE), comprising 26K developers and achieving a correlation > 0.4 with the ground truth data, i.e., actual coding effort. This project models the coding behavior of a developer and gives a generic pattern of the coding time specific to the developer.
Worked on flagging non-developer activity using PCTE project results. This is used to remove coding effort awarded to non-developer activity code snippets.
Built end-to-end data and modelling pipelines and used K8s to utilize the cluster and schedule runs.

Linux, Computer Science, Kubernetes, Keras, Algorithms, Python (Programming Language), +21

Graduate Machine Learning Data Engineer

Jul 2021 – Sep 2022 · 1 yr 2 mos

Worked on calculating the Pure Coding Time of developers using Neural Hidden Markov Models which involves a hybrid architecture of Neural Networks + Hidden Markov Models.
Worked on calculating the PMUE (Peak Model Unit Equivalence) used to get the emission probabilities for the HMM input. This helped correct the methodology for calculating the multiplier for the Coding Effort (CE) calculation.
Worked on building interactive dashboards using Grafana for reporting the results of the PCTE project to the stakeholders.
Worked on building an Asynchronous Micro-Service Architecture for Python Microservices using Celery, RabbitMQ and Flask tech stack.
Automated the Custom Wakatime Flask Microservice and the IDE Plugin for ground truth data collection.
Implemented Custom Task Queues using a Redis backend to achieve fine-grained parallelism on the K8s cluster and to reduce the manual effort of scheduling runs. This reduction in manual effort saved 40% of the time spent.
Created interactive Jupyter Notebook reports for ML solutions, providing stakeholders with user-friendly interfaces to explore data, model performance, and actionable insights.

Flask, Data Analytics, Linux, Computer Science, Object-Oriented Programming (OOP), Postg, +24

ML Data Engineer Intern

Jan 2021 – Jun 2021 · 5 mos

Built an in-house ETA estimation modelling pipeline to calculate the completion time of historical extractions of repositories for Pilots and Clients. Also productionised the pipeline on a dedicated EC2 instance.
Worked with time-series forecasting using the FB Prophet library and multiple regression techniques to build an end-to-end data and modelling pipeline.
Used GridSearchCV and Optuna for hyperparameter sweep and finding the best hyperparameter values for SARIMAX models and Regression models.

Flask, Data Analytics, Computer Science, Pipelines, Jupyter, Analytical Skills, +11

Indian Institute of Information Technology

Winter Research Intern

Dec 2019 – Jan 2020 · 1 mo · Prayagraj, Uttar Pradesh, India

Implemented type-1 and type-2 fuzzy sets in Python using the KM algorithm approach, under the mentorship of Prof. U. S. Tiwary.
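As a rough illustration of this kind of work, the sketch below assumes that "KM" refers to the Karnik-Mendel iterative procedure for the centroid bounds of an interval type-2 fuzzy set. The function name `km_centroid_bounds` and the sample membership values are hypothetical and are not taken from the original project.

```python
def km_centroid_bounds(xs, lower, upper, max_iter=100, tol=1e-9):
    """Return (y_left, y_right) centroid bounds of an interval type-2 fuzzy set.

    xs must be sorted ascending; lower[i] <= upper[i] are the membership
    bounds at xs[i]. Assumes the memberships are positive so that the
    weight sums never vanish.
    """

    def weighted_mean(weights):
        return sum(x * w for x, w in zip(xs, weights)) / sum(weights)

    def iterate(left):
        # Start from the midpoint of the lower and upper memberships.
        weights = [(lo + up) / 2 for lo, up in zip(lower, upper)]
        y = weighted_mean(weights)
        for _ in range(max_iter):
            if left:
                # Left bound: upper memberships at or left of y, lower to the right.
                weights = [up if x <= y else lo for x, lo, up in zip(xs, lower, upper)]
            else:
                # Right bound: lower memberships at or left of y, upper to the right.
                weights = [lo if x <= y else up for x, lo, up in zip(xs, lower, upper)]
            y_new = weighted_mean(weights)
            if abs(y_new - y) < tol:
                return y_new
            y = y_new
        return y

    return iterate(True), iterate(False)


# Hypothetical symmetric interval type-2 set on five sample points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
upper = [0.2, 0.6, 1.0, 0.6, 0.2]
lower = [0.1, 0.3, 0.6, 0.3, 0.1]

y_left, y_right = km_centroid_bounds(xs, lower, upper)
print(f"centroid interval: [{y_left:.4f}, {y_right:.4f}]")
```

When the lower and upper memberships coincide, the set reduces to a type-1 set and both bounds collapse to the ordinary centroid.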

Qin1

Online Instructor

Jul 2019 – Nov 2019 · 4 mos

Taught the basics of Python programming to students aged 10 to 15, along with building small utility apps using the MIT App Inventor tool.

Reliance Industries Limited

Summer Intern

Jun 2019 – Jul 2019 · 1 mo · Navi Mumbai, Maharashtra, India

Built a closed-domain question answering tool trained on the SQuAD dataset using a BERT model fine-tuned for financial documents.

Unsaidtalks

Member of UIP

Jan 2019 – May 2019 · 4 mos · Patiala, Punjab, India

Interviewed seniors in my college regarding their internship and campus placement experience and their preparation strategy. These interviews were thereafter published on the website https://www.unsaidtalks.com/.