Prashant Singh

Senior Software Engineer

San Francisco, California, United States · 8 yrs 3 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Expert in optimizing Apache Spark performance.
Significant contributions to Apache Iceberg projects.
Strong background in AWS data engineering.

Stackforce AI infers this person is a Backend-heavy Fullstack Engineer specializing in SaaS data solutions.

Skills

Core Skills

Apache Spark, AWS, Apache Iceberg

Other Skills

AWS Firehose, AWS Redshift, C, C++, Data Engineering, Data Structures, Elasticsearch, Go, Hadoop, Java, MySQL, PHP, Redis

About

SWE @ Snowflake

Experience

Total Experience: 8 yrs 3 mos

Average Tenure: 3 yrs 5 mos

Current Experience: 4 yrs 7 mos

Snowflake

Senior Software Engineer

Feb 2025 – Present · 1 yr 3 mos

The Apache Software Foundation

2 roles

Apache Spark Contributor

Jan 2022 – Present · 4 yrs 4 mos

PRs authored: https://github.com/apache/spark/pulls/singhpk234
PRs reviewed: https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+reviewed-by%3Asinghpk234+

Apache Iceberg Contributor

Oct 2021 – Present · 4 yrs 7 mos

PRs authored: https://github.com/apache/iceberg/pulls/singhpk234
PRs reviewed: https://github.com/apache/iceberg/pulls?q=is%3Apr+is%3Aopen+reviewed-by%3Asinghpk234+

Amazon Web Services (AWS)

SDE2

Nov 2020 – Jan 2025 · 4 yrs 2 mos · On-site

EMR Spark Runtime optimization: Added adaptive JoinSelection and improved ShuffledHashJoin codegen to make it more robust and remove unnecessary computation. Wrote Catalyst optimizer rules that let Spark identify redundant scans and produce a more efficient rewritten plan. Improved and generalized the existing join-reordering rules to make them more performant. Together, this work trimmed a good chunk from the overall sum and geomean of Spark runtime on the TPC-DS and TPC-H benchmarks.
Apache Iceberg in AWS EMR: Added an optimized location provider for a more efficient file layout on object stores such as S3, DSv2 optimizations for Apache Iceberg, and support for S3 Access Points for disaster recovery. Contributed heavily to OSS Iceberg with numerous features, optimizations, bug fixes, and PR reviews. Contributions: https://github.com/apache/iceberg/pulls/singhpk234
Apache Iceberg in AWS Redshift: Natively integrated Apache Iceberg into all layers (optimizer, code generator, and scanner) of Redshift. Improved Redshift performance on Iceberg by integrating Puffin to provide NDV stats to its CBO.
Apache Iceberg in AWS Data Firehose: Enhanced the ability to ingest Change Data Capture into Apache Iceberg tables via Firehose. Designed and launched access control for Apache Iceberg tables via AWS Lake Formation when using Firehose. Designed and launched AWS S3 Tables (a fully managed Apache Iceberg catalog by AWS S3) with Firehose, which was demoed at AWS re:Invent 2024.
Demo: https://www.youtube.com/watch?v=7ivaChj_KVA&t=2386s&ab_channel=AWSEvents

Apache Spark, Apache Iceberg, AWS, Data Engineering