Amarjeet Singh

Software Engineer

Tempe, Arizona, United States · 6 yrs 10 mos experience

Key Highlights

Expert in AWS and ETL processes.
Reduced data validation efforts by 70%.
Developed real-time data processing applications.

Stackforce AI infers this person is a Data Engineer specializing in SaaS solutions with expertise in ETL and cloud technologies.

Contact

milkycam@yahoo.com LinkedIn

Skills

Core Skills

AWS · ETL · Apache Spark · Hadoop · Software Development

Other Skills

AWS Glue · AWS Lambda · AWS SQS · Agile Methodologies · Amazon EC2 · Amazon Redshift · Amazon Web Services (AWS) · Apache Kafka · Apache NiFi · C (Programming Language) · C++ · Computer Science · Data Structures · Docker · Elasticsearch

About

Data Engineer with experience in Big Data, Hadoop, data engineering, and AWS.
• Hands-on experience with Hadoop ecosystem components such as HDFS, Hive, Sqoop, and Spark.
• Good knowledge of Amazon Web Services, including EC2, S3, and Glue.
• Able to code in Python, Java, and SQL.
• Good experience in the project life cycle and in ETL transformations, including data acquisition, data cleansing, data manipulation, data validation, data mining, and visualization.
Keywords: SQL, Python, AWS, ETL, Apache Spark, Power BI, Data Pipelines, Data Engineering, Data Analytics.

Experience

6 yrs 10 mos

Total Experience

1 yr 5 mos

Average Tenure

1 yr 6 mos

Current Experience

Asm

Software Engineer II

Nov 2024 – Present · 1 yr 6 mos · Phoenix, Arizona, United States

Rocket Central

Big Data Engineer Intern

May 2023 – Aug 2023 · 3 mos · Detroit, Michigan, United States · Hybrid

Built a configurable job that automates data validation in an S3 bucket (checking that data arrives in S3 at the required intervals) using Python and AWS Lambda, reducing manual validation effort by 70%.
Extracted, aggregated, and consolidated CSV data from S3 in AWS Glue using PySpark (a job that fetches data from the transformed layer and pushes it into the analytical layer) and loaded it into Redshift for analysis.
Achieved a 40% reduction in Power BI dashboard load time, saving $1,000 monthly by optimizing Athena queries.
Maintained data integrity and consistency through automation of repetitive validation tasks.
Processed and analyzed over 5 TB of CSV data in AWS Glue using PySpark and loaded it into Redshift for analysis.

Amazon Web Services (AWS) · MySQL · Python (Programming Language) · Extract, Transform, Load (ETL) · AWS Glue · Amazon Redshift · +2

Arizona State University

Graduate Teaching Assistant and Services Assistant

Jan 2023 – May 2024 · 1 yr 4 mos · Tempe, Arizona, United States

Supported testing and grading of assignments, quizzes, and exams.
Engaged with 90 undergraduate students during office hours, addressing questions about course materials and assignments.
Received positive feedback from faculty and students for outstanding support and dedication to fostering the learning experience.

Knoldus Inc

Software Consultant

Jun 2021 – Aug 2022 · 1 yr 2 mos · Noida, Uttar Pradesh, India

Developed a Spark Structured Streaming application in Java that integrates with Kafka for real-time JSON data processing and pushes the processed data into Elasticsearch.
Implemented checkpointing in Spark to save the state of streaming applications, allowing them to recover gracefully from failures.
Ensured data integrity and fault tolerance by regularly saving the intermediate processing state (checkpointing).
Created a job that retrieves data from Elasticsearch and pushes it back into Kafka, ensuring data recovery if the checkpoint becomes corrupted.
Worked closely with the client and stakeholders to elicit and gather business data requirements.

Software Development · Apache Spark · Elasticsearch · Linux · Git · Scala · +1

Jio Platform Limited

Software Developer

Aug 2018 – Jun 2021 · 2 yrs 10 mos · Navi Mumbai, Maharashtra, India · On-site

Programmed an application for the HR team that automatically assigns employee ID numbers and prints ID cards, built with Java, the Spring framework, and an MS SQL database, and integrated it with the existing portal.
Designed an ETL pipeline using Apache NiFi to streamline data integration and validation (fetching data from SFTP servers and pushing it into HDFS after processing), reducing manual data handling by 40%.
Implemented and maintained an ETL pipeline to ingest transactional and event data into external Hive tables for nearly 60,000 employees.
Developed stored procedures to streamline database operations, reducing manual intervention and increasing efficiency.
Enhanced data processing and retrieval times by optimizing existing stored procedures, improving overall system efficiency.
Collaborated with cross-functional teams to identify and resolve production issues, resulting in a 20% reduction in application downtime and improved overall system stability.

Software Development · Scripting · Hadoop · Zoomdata · Java · Git · +3