Arjun Ahuja

Senior Software Engineer

Toronto, Ontario, Canada · 7 yrs 8 mos experience

Key Highlights

  • Designed a notification platform for 100 million customers.
  • Optimized algorithms to save $10,000+ annually.
  • Expert in AWS and distributed systems.
Stackforce AI infers this person is a SaaS expert with strong capabilities in distributed systems and algorithm optimization.

Skills

Core Skills

PySpark, Google Cloud Platform (GCP), Microservices, Software Architecture, Computer Science, Java, AWS, Algorithms

Other Skills

Feast, Dataproc, BigTable, Google BigQuery, Hive, SonarQube, Python (Programming Language), Apache Airflow, Docker, Large Language Models (LLM), Databases, Relational Databases, Low Latency, Domain-Driven Design (DDD), Scalable Web Applications

About

I love solving problems and optimizing algorithms. I have an inherent drive to understand how an algorithm works, and I always ask: is this the best we can do? During my undergrad, just when I thought that finding the best algorithm for a problem was enough, I was introduced to parallel and distributed systems, and now I could make already fast algorithms even faster. I have been like this ever since.

Recently, while working on the redesign of a prediction model, I observed that shuffle writes were at least three times the size of the input data, and at the same time we were getting MemoryLimitExceeded issues in EMR (though infrequently). We were using Spark's countDistinct, and I root-caused the problem: most of the data stored by countDistinct was not required in the final solution, so there was no need to shuffle it. I formulated my own bitset-based algorithm that works like countDistinct without that extra data. It ran 80% faster, with a 75% reduction in shuffle writes and no memory issues.

I have gained experience in the following domains during my tenure at Amazon:

  • Built products with AWS technologies (DynamoDB, CloudFormation, Data Pipeline, CloudWatch, EMR, Lambda, Elasticsearch, SNS, SQS, RDS, EC2).
  • Created a Java-based service and several of its APIs.
  • Redesigned and optimized a prediction model with PySpark.
  • Worked with front-end technologies such as HTML, CSS, JavaScript, XML, and Spring for Amazon Giveaway.

I hope to work on more diverse projects, solving customer problems and generating positive impact.
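
The bitset-based distinct count mentioned above is not spelled out in this profile. As a rough illustration only, the plain-Python sketch below shows one way such a count could work, assuming the values come from a small, known domain so that each value can be given a fixed bit position. The function names (build_index, partial_bitsets, merge_bitsets, count_distinct) and the sample data are hypothetical, and this is not the author's actual algorithm or Spark code.

```python
from collections import defaultdict


def build_index(domain):
    """Assign each possible value a fixed bit position (assumes a small, known domain)."""
    return {value: pos for pos, value in enumerate(domain)}


def partial_bitsets(rows, index):
    """Per-partition step: fold (key, value) rows into one integer bitset per key."""
    bitsets = defaultdict(int)
    for key, value in rows:
        bitsets[key] |= 1 << index[value]
    return dict(bitsets)


def merge_bitsets(partials):
    """Merge step: OR the per-partition bitsets together, key by key."""
    merged = defaultdict(int)
    for partial in partials:
        for key, bits in partial.items():
            merged[key] |= bits
    return dict(merged)


def count_distinct(merged):
    """The distinct count for a key is the number of set bits in its bitset."""
    return {key: bin(bits).count("1") for key, bits in merged.items()}


if __name__ == "__main__":
    # Hypothetical data: two partitions of (key, value) rows.
    partitions = [
        [("u1", "a"), ("u1", "b"), ("u1", "a")],
        [("u1", "b"), ("u2", "c")],
    ]
    index = build_index(["a", "b", "c"])
    partials = [partial_bitsets(rows, index) for rows in partitions]
    print(count_distinct(merge_bitsets(partials)))  # {'u1': 2, 'u2': 1}
```

In this sketch each partition emits a single integer per key instead of every distinct value, so much less data would have to move between stages. That is the kind of shuffle reduction the paragraph above describes.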

Experience

Total Experience: 7 yrs 8 mos
Average Tenure: 1 yr 5 mos
Current Experience: 1 yr 4 mos

Amazon Web Services (AWS)

Software Engineer 2

Feb 2025 - Present · 1 yr 4 mos · Toronto, Ontario, Canada · Hybrid

  • Software Engineer for AWS Aurora

Walmart Global Tech

Senior Software Engineer

Jun 2023 - Feb 2025 · 1 yr 8 mos · Sunnyvale, California, United States · Hybrid

  • Designed, coded, and managed a platform to schedule notifications for over 100 million customers.
  • Supported multiple A/B tests targeting different customer segments simultaneously, using configurable AI models.
  • Implemented the scheduling system using PySpark.
  • Developed and integrated in-house algorithms for enhanced functionality.
PySpark, Google Cloud Platform (GCP), Feast, Dataproc, BigTable, Google BigQuery, +6 more

Palantir Technologies

Software Engineer

Jun 2022 - Apr 2023 · 10 mos · New York, New York, United States

  • SWE on Gotham Search
Databases, Relational Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, +5 more

Columbia University in the City of New York

3 roles

Course Assistant - Advisor

Jan 2022 - May 2022 · 4 mos

  • CA - advisor for COMS 4113 Distributed Systems
Computer Science

Head Teaching Assistant

Promoted

Sep 2021 - Dec 2021 · 3 mos

  • Head Teaching Assistant for EECS 4750 Heterogeneous Computing:
  • Created assignments and the exam for the course, covering programming in PyOpenCL and PyCUDA and concepts such as tiling, shared memory, and constant memory optimizations.
  • Held office hours and evaluated assignments, projects, and the exam.
Computer Science

Graduate Teaching Assistant

Jan 2021 - May 2021 · 4 mos · New York, United States

  • Worked as a Teaching Assistant for Programming Language Technologies.
Computer Science, Data Structures

TuSimple

Software Development Engineer Intern

May 2021 - Aug 2021 · 3 mos · California, United States

  • Heterogeneous Computing Team.
  • Worked on optimising Lidar and Perception algorithms for GPU. Learnt software development in C++ using CMake, Jenkins, GTest, and other required frameworks. Also learnt CUDA and how to optimise algorithms using techniques such as tiling, shared memory, and constant memory. Most of the work involved parallelising for loops and algorithms based on matrix multiplication and transpose. Also worked on optimising some OpenCV functions.
Computer Science, Data Structures

Amazon

3 roles

Software Development Engineer - 2

Sep 2020 - Dec 2020 · 3 mos

Databases, Relational Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, +5 more

Software Development Engineer

Promoted

Jul 2018 - Sep 2020 · 2 yrs 2 mos

  • Worked as an SDE for the E-HVA team, Jan 2020 - Dec 2020:
  • Worked in PySpark to refactor and redesign 10,000+ lines of Hidden Markov Model code written by ML scientists. Improved the worst-case runtime of feature generation by ~80% and cut memory requirements by reducing shuffle writes by 75%, using a custom bitset-based count-distinct algorithm that I designed; this saved the team $10,000+ per year. Improved the runtime of training, scoring, pre-processing, and post-processing by ~30% by optimizing caching, partitions, and joins. Currently working on automating the onboarding process for our customers.
  • Worked as an SDE for Data Ingestion, Jan 2019 - December 2019:
  • Designed, created, and managed a Java-based service that schedules big data processing jobs according to a cron schedule. Created several APIs for CRUD and operational readiness. Used AWS DynamoDB as the database. Created a UI over Kibana using metadata from DynamoDB Streams, indexing the metadata into AWS Elasticsearch using AWS Lambda. The scheduler ran 5,000+ schedules daily with an availability rate of over 99.99%.
  • Worked as an SDE for the Asin Data Service team, July 2018 - January 2019: Asin Data Service is a service on thousands of discrete applications running in a distributed environment. It ensures that data from more than 40 services is aggregated for display on Amazon’s websites, tens of thousands of times every second, with minimal latency. A bad deployment to AsinDataService used to result in a bad customer experience for at least 4 hours in the North America region and an average of 2 hours in other regions. In addition, the team was experiencing several issues during deployment (high-severity issues, sev-2 at Amazon) for which there was no root cause. It turned out that the two were related: I root-caused the issue to JVM warmup. After the fix I put in place, deployment time dropped by 35+% and the number of issues dropped from 22 per week to 0.
Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, Data Structures, +4 more

SDE-Intern

May 2017 - Jul 2017 · 2 mos · Bangalore

  • Worked for the Giveaway team: I was tasked with adding a new eligibility requirement to Amazon Giveaway. I created a seller-side webpage, "enter to giveaway with requirement as subscribe to newsletter", and a customer-side page where the customer had to fulfill the requirement of subscribing to the newsletter to be eligible for the giveaway. For this project I worked with Java, HTML, JavaScript, XML, and Spring. Timely delivery and deployment to production by the end of my internship helped me get a pre-placement offer at Amazon.
Computer Science, Data Structures

IIT Hyderabad

Teaching Assistant

Aug 2016 - Dec 2017 · 1 yr 4 mos · Hyderabad, Telangana, India

  • Teaching assistant for:
  • Introduction to Programming in C
  • Advanced Data Structures and Algorithms
  • Responsibilities: graded exams and assignments, conducted quizzes, and held office hours to clear doubts.
Databases, Low Latency, Computer Science, Domain-Driven Design (DDD), Scalable Web Applications, Data Structures, +4 more

E-Cell, IIT Hyderabad

Coordinator

Apr 2016 - May 2017 · 1 yr 1 mo · IIT Hyderabad

  • Key highlights:
  • Member of the organizing committee for Megathon 2017 (the largest student-led hackathon in South India).
  • Other responsibilities:
  • Conducted sessions and invited speakers to promote startup culture on campus.
Data Structures

Education

Columbia University

Master of Science - MS — Computer Science

Jan 2020 - Jan 2022

Indian Institute of Technology Hyderabad

Bachelor of Technology (BTech) — Computer Science

Jan 2014 - Jan 2018

freeCodeCamp

Full Stack Web Development Certification — Computer Software Engineering

Jan 2016 - Jan 2017
