Arjun Ahuja — Senior Software Engineer

I love solving problems and optimizing algorithms. I have an inherent drive to understand how an algorithm works and to ask whether it is the best we can do. During my undergrad, just when I thought that finding the best algorithm for a problem was enough, I was introduced to parallel and distributed systems; now I could make already fast algorithms even faster. I have been like this ever since.

Recently, while working on the redesign of the Prediction Model, I observed that shuffle writes were at least three times the size of the input data, and at the same time we were getting MemoryLimitExceeded issues in EMR (though infrequently). We were using Spark's countDistinct. I root-caused the problem: most of the data stored by countDistinct was not required in the final result, so there was no need to shuffle it. I formulated my own bitset-based algorithm that works like countDistinct without carrying that data. It ran 80% faster with a 75% reduction in shuffle writes and without any memory issues.

I have gained experience in the following domains during my tenure at Amazon:
* Built products with AWS technologies (DynamoDB, CloudFormation, Data Pipeline, CloudWatch, EMR, Lambda, Elasticsearch, SNS, SQS, RDS, EC2).
* Created a Java-based service and several of its APIs.
* Redesigned and optimized a prediction model with PySpark.
* Worked with front-end technologies such as HTML, CSS, JavaScript, XML, and Spring for Amazon Giveaway.

I hope to work on more diverse projects, solving customer problems and generating positive impact.

Stackforce AI infers this person is a SaaS expert with strong capabilities in distributed systems and algorithm optimization.

Location: Toronto, Ontario, Canada

Experience: 7 yrs 8 mos

Skills

PySpark
Google Cloud Platform (GCP)
Microservices
Software Architecture
Computer Science
Java
AWS
Algorithms

Career Highlights

Designed a notification platform for 100 million customers.
Optimized algorithms to save $10,000+ annually.
Expert in AWS and distributed systems.

Work Experience

Amazon Web Services (AWS)

Software Engineer 2 (1 yr 4 mos)

Walmart Global Tech

Senior, Software Engineer (1 yr 8 mos)

Palantir Technologies

Software Engineer (10 mos)

Columbia University in the City of New York

Course Assistant - Advisor (4 mos)

Head Teaching Assistant (3 mos)

Graduate Teaching Assistant (4 mos)

TuSimple

Software Development Engineer Intern (3 mos)

Amazon

Software Development Engineer - 2 (3 mos)

Software Development Engineer (2 yrs 2 mos)

SDE-Intern (2 mos)

IIT Hyderabad

Teacher Assistant (1 yr 4 mos)

E-Cell, IIT Hyderabad

Coordinator (1 yr 1 mo)

Education

Master of Science - MS at Columbia University

Bachelor of Technology (BTech) at Indian Institute of Technology Hyderabad

Full Stack Web Development Certification at freeCodeCamp

Skills

Core Skills

PySpark
Google Cloud Platform (GCP)
Microservices
Software Architecture
Computer Science
Java
AWS
Algorithms

Other Skills

Feast
Dataproc
BigTable
Google BigQuery
Hive
Sonarqube
Python (Programming Language)
Apache Airflow
Docker
Large Language Models (LLM)
Databases
Relational Databases
Low Latency
Domain-Driven Design (DDD)
Scalable Web Applications

Experience

Total Experience: 7 yrs 8 mos

Average Tenure: 1 yr 5 mos

Current Experience: 1 yr 4 mos

Amazon Web Services (AWS)

Software Engineer 2

Feb 2025 – Present · 1 yr 4 mos · Toronto, Ontario, Canada · Hybrid

Software Engineer for AWS Aurora

Walmart Global Tech

Senior, Software Engineer

Jun 2023 – Feb 2025 · 1 yr 8 mos · Sunnyvale, California, United States · Hybrid

Designed, coded, and managed a platform to schedule notifications for over 100 million customers.
Supported multiple A/B tests targeting different customer segments simultaneously, using configurable AI models.
Implemented the scheduling system using PySpark.
Developed and integrated in-house algorithms for enhanced functionality.

PySpark, Google Cloud Platform (GCP), Feast, Dataproc, BigTable, Google BigQuery, +6 more

Palantir Technologies

Software Engineer

Jun 2022 – Apr 2023 · 10 mos · New York, New York, United States

SWE on Gotham Search

Databases, Relational Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, +5 more

Columbia University in the City of New York

3 roles

Course Assistant - Advisor

Jan 2022 – May 2022 · 4 mos

CA - advisor for COMS 4113 Distributed Systems

Computer Science

Head Teaching Assistant

Promoted

Sep 2021 – Dec 2021 · 3 mos

Head Teaching Assistant for EECS 4750 Heterogeneous Computing:
Created assignments and the exam for the course, covering programming in PyOpenCL and PyCUDA and concepts such as tiling, shared memory, and constant memory optimizations.
Held office hours and evaluated assignments, projects, and the exam.

Computer Science

Graduate Teaching Assistant

Jan 2021 – May 2021 · 4 mos · New York, United States

Worked as a Teaching Assistant for Programming Language Technologies.

Computer Science, Data Structures

TuSimple

Software Development Engineer Intern

May 2021 – Aug 2021 · 3 mos · California, United States

Heterogeneous Computing Team.
Worked on optimising Lidar and Perception algorithms for GPU. Learnt software development in C++ using CMake, Jenkins, GTest, and other required frameworks. Also learnt CUDA and how to optimise algorithms using techniques such as tiling, shared memory, and constant memory. The work mainly involved parallelising for loops and algorithms based on matrix multiplication and transpose. Also worked on optimising some OpenCV functions.

Computer Science, Data Structures

Amazon

3 roles

Software Development Engineer - 2

Sep 2020 – Dec 2020 · 3 mos

Databases, Relational Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, +5 more

Software Development Engineer

Promoted

Jul 2018 – Sep 2020 · 2 yrs 2 mos

Worked as an SDE for the E-HVA team, Jan 2020 - Dec 2020:
Worked in PySpark to refactor and redesign 10,000+ lines of code written by ML scientists for a Hidden Markov Model. Improved the worst-case runtime of feature generation by ~80% and reduced memory requirements by cutting shuffle writes by 75%, using a custom bitset-based count-distinct algorithm that I designed; this saved the team $10,000+ per year. Improved the runtime of training, scoring, pre-processing, and post-processing by ~30% by optimizing caching, partitions, and joins. Currently working on automating the onboarding process for our customers.
Worked as an SDE for Data Ingestion, Jan 2019 - December 2019:
Designed, created, and managed a Java-based service that schedules big data processing jobs according to a cron schedule. I created several APIs for CRUD and operational readiness, with AWS DynamoDB as the database. Created a UI over Kibana using metadata from DynamoDB streams, indexing the metadata into AWS Elasticsearch using AWS Lambda. The scheduler ran 5,000+ schedules daily with an availability rate of over 99.99%.
Worked as an SDE for the Asin Data Service team, July 2018 - January 2019:
Asin Data Service is a service on thousands of discrete applications running in a distributed environment that ensures data from more than 40 services is aggregated for display on Amazon’s websites, tens of thousands of times every second with minimal latency. A bad deployment to AsinDataService used to result in a bad customer experience for at least 4 hours in the North America region and an average of 2 hours in other regions. In addition, the team was experiencing several high-severity issues (sev-2 at Amazon) during deployments, for which there was no known root cause. It turned out that both were interrelated: I root-caused the issue to JVM warmup. After the fix I put in place, deployment time dropped by 35+% and the number of issues dropped from 22 per week to 0.

Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, Data Structures, +4 more

SDE-Intern

May 2017 – Jul 2017 · 2 mos · Bangalore

Worked for the Giveaway team: I was tasked with adding a new eligibility requirement to Amazon Giveaway. I created the seller-side webpage "enter to giveaway with requirement as subscribe to newsletter" and a customer-side page where the customer had to fulfill the requirement of subscribing to the newsletter to be eligible for the giveaway. For this project I worked with Java, HTML, JavaScript, XML, and Spring. Timely delivery and deployment to production by the end of my internship helped me get a pre-placement offer at Amazon.

Computer Science, Data Structures

IIT Hyderabad

Teacher Assistant

Aug 2016 – Dec 2017 · 1 yr 4 mos · Hyderabad, Telangana, India

Teaching assistant for:
Introduction to Programming in C
Advanced Data Structures and Algorithms
Responsibilities: Graded exams and assignments, conducted quizzes, and held office hours to clear doubts.

Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, Data Structures, +4 more

E-Cell, IIT Hyderabad

Coordinator

Apr 2016 – May 2017 · 1 yr 1 mo · IIT Hyderabad

Key highlights:
Member of the organizing committee for Megathon 2017 (the largest student-led hackathon in South India).
Other responsibilities:
Conducted sessions and invited speakers to promote startup culture on campus.

Data Structures