Nimish Bajaj

Software Engineer

Seattle, Washington, United States · 7 yrs 10 mos experience

Most Likely To Switch · AI Enabled

Key Highlights

Expert in building scalable data and ML solutions.
Proven track record in automating data pipelines.
Strong experience with cloud platforms and distributed computing.

Stackforce AI infers this person is a SaaS expert with a strong focus on data engineering and machine learning.

Skills

Core Skills

Machine Learning · Data Engineering

Other Skills

AWS EMR · AWS Lambda · AWS S3 · AWS SageMaker · Algorithms · Amazon Elastic MapReduce (EMR) · Amazon Kinesis · Amazon S3 · Amazon Web Services (AWS) · Apache Kafka · Apache Spark · Artificial Intelligence (AI) · Cassandra · Cloudera · Communication skills

About

I love building robust, performant solutions around data. I am a big fan of distributed computing and enjoy writing code that scales well. Most of my experience is in automating and architecting data and ML pipelines, which has taught me about cloud platforms, databases, Spark, APIs, Java, and Python.

Experience

7 yrs 10 mos

Total Experience

1 yr 6 mos

Average Tenure

3 yrs

Current Experience

Amazon

Software Development Engineer II

May 2023 – Present · 3 yrs · Bellevue, Washington, United States · On-site

University of Florida

2 roles

Graduate Teaching Assistant

Jan 2023 – May 2023 · 4 mos

TA for CIS6261 - Trustworthy Machine Learning

Graduate Research Assistant

Jan 2022 – Dec 2022 · 11 mos

Conducted research on methods for extracting and visualizing semantic differences in textual inputs.

Apple

Software Engineer Intern

May 2022 – Aug 2022 · 3 mos · Cupertino, California, United States

LTIMindtree

Senior Software Engineer

Oct 2020 – Jul 2021 · 9 mos · Bangalore Urban

Created a Spark framework for machine learning automation (AutoML). It is built to be scalable and efficient for binary classification, multi-class classification, and regression on tabular datasets with a variety of feature types, including numeric, categorical, date, and text columns.
The framework was developed and integrated into the frontend of L&T's Lymbyc product within three months, and five major clients used it for key driver analysis, data exploration, and insights development.
Designed and implemented a Model Management System for storing and retrieving PySpark models through S3.
Led the project to develop the Auto-Tune framework, which tunes Spark jobs on clusters automatically. It combines a heuristics-based (rule-based) approach with an optimization-based (machine learning) strategy for tuning Spark workloads.
Without requiring any human intervention, Auto-Tune saved 30% of cluster resources and significantly improved the Spark job success rate. It is used extensively in L&T's Lymbyc product.
Worked extensively with Spark, AWS EMR, and AWS S3.

Spark · AWS EMR · AWS S3 · Machine Learning · Data Engineering
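
As a rough illustration of what the rule-based half of a Spark tuner can look like (this is not the Auto-Tune code itself), the sketch below sizes executors from the shape of a cluster. The `ClusterSpec` type, the `suggest_spark_conf` function, and the specific rules (reserve one core and 1 GB per node, cap executors at five cores, hold back 10% of memory for overhead) are assumptions chosen for illustration. Only the `spark.executor.*` keys are real Spark properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a cluster's worker nodes."""
    worker_nodes: int
    cores_per_node: int
    memory_gb_per_node: int


def suggest_spark_conf(cluster: ClusterSpec,
                       cores_per_executor: int = 5,
                       overhead_fraction: float = 0.10) -> dict[str, str]:
    """Derive executor settings from simple, assumed rules of thumb."""
    # Rule 1 (assumed): leave one core and 1 GB per node for the OS and daemons.
    usable_cores = max(1, cluster.cores_per_node - 1)
    usable_mem_gb = max(1, cluster.memory_gb_per_node - 1)

    # Rule 2 (assumed): cap cores per executor, never exceeding what a node has.
    cores = min(cores_per_executor, usable_cores)
    executors_per_node = usable_cores // cores

    # Rule 3 (assumed): keep one executor slot free for the driver.
    instances = max(1, executors_per_node * cluster.worker_nodes - 1)

    # Rule 4 (assumed): split node memory evenly across executors, then
    # hold back a fraction of each share for off-heap overhead.
    share_gb = usable_mem_gb / executors_per_node
    heap_gb = max(1, int(share_gb * (1 - overhead_fraction)))

    return {
        "spark.executor.instances": str(instances),
        "spark.executor.cores": str(cores),
        "spark.executor.memory": f"{heap_gb}g",
    }


# Example: ten worker nodes with 16 cores and 64 GB of memory each.
conf = suggest_spark_conf(ClusterSpec(worker_nodes=10,
                                      cores_per_node=16,
                                      memory_gb_per_node=64))
```

An optimization-based tuner would typically start from a baseline like this and then search nearby settings using observed job metrics.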

Quaero

Machine Learning Engineer

Jan 2020 – Oct 2020 · 9 mos · Bengaluru Area, India

Quaero has since been acquired by CSG.
Designed and built a complete package for end-to-end machine learning tasks, including data preprocessing, advanced feature development, cross-validation, and hyperparameter tuning for various models. The package also lets users generate model training and profiling reports to assess model outcomes and uncover insights not apparent from the initial dataset.
Developed scalable, modular microservices and optimized APIs using multithreading in Python, reducing response time to less than 1 second.
Developed mechanisms to launch, monitor, and terminate stateless Spark clusters, saving 30% in VM cost.
Built ETL workflows on Spark that achieved a 5x performance improvement over the traditional Python workflows.

Mu Sigma Inc.

Decision Scientist - Mu Sigma Innovation Lab

Sep 2017 – Oct 2019 · 2 yrs 1 mo · Bengaluru, Karnataka, India

As a decision scientist, I built and optimized data analytics infrastructure to help multiple global enterprise clients turn raw data into actionable insights. This involved creating and running data pipelines and streamlining data operations.
I built a big data pipeline for a telecom client to generate key insights from users' web interaction data. For this project, I researched the Lambda architecture for processing both real-time and batch data, then built a real-time pipeline using AWS Kinesis and Spark Streaming that processed over 1 million records per second. Pre-computing batch and real-time views of the data cut average query runtime from over 5 seconds to 300 milliseconds.
Created Python notebooks and packages for NLP problems, including intent classification, entity extraction, and topic modeling, for use across the organization.
Built Mu Sigma's artificial intelligence-based assistant, which acts as a layer of intelligence over Mu Sigma's CMS.
Developed and maintained several backend services for client needs using REST APIs.
Improved algorithms and experimented with ML models for intent classification.
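
The Lambda architecture mentioned above answers queries quickly by merging a pre-computed batch view with a small, continuously updated real-time view instead of scanning raw events. The sketch below shows that idea in plain Python with in-memory counters. It is a toy illustration, not the production pipeline: the event shape (`{"page": ...}`), the page-view metric, and all function and class names are assumptions. In a real deployment the batch view would come from a Spark batch job and the real-time view from a streaming job.

```python
from collections import Counter


def build_batch_view(historical_events):
    """Batch layer: pre-compute page-view counts over all stored events."""
    return Counter(event["page"] for event in historical_events)


class RealtimeView:
    """Speed layer: incrementally counts events not yet in the batch view."""

    def __init__(self):
        self._counts = Counter()

    def update(self, event):
        self._counts[event["page"]] += 1

    def get(self, page):
        # Counter returns 0 for pages it has not seen.
        return self._counts[page]


def query_page_views(page, batch_view, realtime_view):
    """Serving layer: merge both pre-computed views with two lookups."""
    return batch_view.get(page, 0) + realtime_view.get(page)


# Hypothetical data: events already processed by the last batch run ...
historical = [{"page": "/home"}, {"page": "/home"},
              {"page": "/plans"}, {"page": "/home"}]
batch_view = build_batch_view(historical)

# ... and events that arrived after that run.
realtime_view = RealtimeView()
for event in [{"page": "/home"}, {"page": "/home"}]:
    realtime_view.update(event)

home_views = query_page_views("/home", batch_view, realtime_view)
```

Because each query is a pair of dictionary lookups rather than a scan over raw events, query latency stays roughly constant as the event history grows.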