Nimish Bajaj

Software Engineer

Seattle, Washington, United States · 7 yrs 8 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Expert in building scalable data and ML solutions.
  • Proven track record in automating data pipelines.
  • Strong experience with cloud platforms and distributed computing.
Stackforce AI infers this person is a SaaS expert with a strong focus on data engineering and machine learning.

Skills

Core Skills

Machine Learning · Data Engineering

Other Skills

AWS EMR · AWS Lambda · AWS S3 · AWS SageMaker · Algorithms · Amazon Elastic MapReduce (EMR) · Amazon Kinesis · Amazon S3 · Amazon Web Services (AWS) · Apache Kafka · Apache Spark · Artificial Intelligence (AI) · Cassandra · Cloudera · Communication skills

About

I love building robust and performant solutions around data. I am a big fan of distributed computing and love writing code that scales well. My primary experience has been in automating and architecting data and ML pipelines, which has taught me about cloud platforms, databases, Spark, APIs, Java, and Python.

Experience

Amazon

Software Development Engineer II

May 2023 - Present · 2 yrs 10 mos · Bellevue, Washington, United States · On-site

University of Florida

2 roles

Graduate Teaching Assistant

Jan 2023 - May 2023 · 4 mos

  • TA for CIS6261 - Trustworthy Machine Learning

Graduate Research Assistant

Jan 2022 - Dec 2022 · 11 mos

  • Conducted research on methods for extracting and visualizing semantic differences in textual inputs.

Apple

Software Engineer Intern

May 2022 - Aug 2022 · 3 mos · Cupertino, California, United States

LTIMindtree

Senior Software Engineer

Oct 2020 - Jul 2021 · 9 mos · Bangalore Urban

  • Created a Spark framework for machine learning automation (AutoML). It is built to be scalable and efficient across a variety of tasks (binary and multi-class classification, and regression) on tabular datasets with a range of feature types, including numeric, categorical, date, and text.
  • The framework was developed and integrated into the frontend of L&T's Lymbyc product within three months, and five major clients used it for key driver analysis, data exploration, and insights development.
  • Designed and implemented a Model Management System for storing and retrieving PySpark models through S3.
  • Led the project to develop the Auto-Tune framework, which automatically tunes Spark jobs on clusters. It combines a heuristics-based (rule-based) approach with an optimization-based (machine learning) strategy.
  • Without requiring any human intervention, Auto-Tune saves 30% of cluster resources and significantly improves the Spark job success rate. It is used extensively in L&T's Lymbyc product.
  • Worked extensively with Spark, AWS EMR, and AWS S3.
Spark · AWS EMR · AWS S3 · Machine Learning · Data Engineering

Quaero

Machine Learning Engineer

Jan 2020 - Oct 2020 · 9 mos · Bengaluru Area, India

  • Now acquired by CSG
  • Designed and built a complete package for handling end-to-end machine learning tasks, including data preprocessing, advanced feature development, cross-validation, and hyperparameter tuning for various models. It also lets users generate model training and profiling reports to assess model outcomes and uncover insights not apparent from the initial dataset.
  • Developed scalable, modular microservices and optimized APIs using multithreading in Python, reducing response time to less than 1 second.
  • Developed mechanisms to launch, monitor, and terminate stateless Spark clusters, saving 30% in VM cost.
  • Built ETL workflows on Spark, achieving a 5x performance improvement over traditional Python workflows.

Mu Sigma Inc.

Decision Scientist - Mu Sigma Innovation Lab

Sep 2017 - Oct 2019 · 2 yrs 1 mo · Bengaluru, Karnataka, India

  • As a decision scientist, I built and optimized data analytics infrastructure to help multiple global enterprise clients turn raw data into actionable insights, creating and running data pipelines and streamlining data operations.
  • I built a big data pipeline for a telecom client to generate key insights from users' web interaction data. For this project, I conducted extensive research into the Lambda architecture for processing both real-time and batch data. I built a real-time processing pipeline using AWS Kinesis and Spark Streaming that processed data at a rate of over 1 million records per second. Pre-computing batch and real-time views of the data reduced average query runtime from over 5 seconds to 300 milliseconds.
  • Created Python notebooks and packages for NLP problems, including intent classification, entity extraction, and topic modeling, for use across the organization.
  • Built Mu Sigma's artificial intelligence-based assistant, which acts as a layer of intelligence over Mu Sigma's CMS.
  • Developed and maintained several backend services for client needs using REST APIs.
  • Improved algorithms and experimented with ML models for intent classification.

Education

University of Florida

Master of Science - MS — Computer Science

Aug 2021 - May 2023

Maharaja Surajmal Institute Of Technology

Bachelor of Technology (B.Tech.) — Computer Science

Jan 2013 - Jan 2017

Kendriya Vidyalaya

Jan 2001 - Jan 2013
