Ganesha S.

Software Engineer

Bengaluru, Karnataka, India · 10 yrs 2 mos experience

Highly Stable · AI ML Practitioner

Key Highlights

Expert in performance tuning of Apache Spark workloads.
Proven track record in optimizing cloud costs and enhancing application performance.
Strong leadership in mentoring and technical training.

Stackforce AI infers this person is a Backend-heavy Fullstack Engineer specializing in Big Data and Cloud technologies.

Skills

Core Skills

Apache Spark · Performance Tuning · Spark Structured Streaming · Real-time Analytics · Hive · Tez · Application Resiliency

Other Skills

AWS · AWS Glue Data Catalog · AWS S3 · Algorithms · Apache Pig · Apache Spark Streaming · Automation · BFSI · Big Data · Bug Fixes · C (Programming Language) · C++ · ClickHouse · Container Optimization

About

Experienced Software Engineer with a demonstrated history of working on Big Data and Cloud platforms. Skilled in Distributed Systems, AWS Cloud, Databricks, Java, Python, Scala, SQL, Shell Scripting, and the internals of Spark, Hive, Tez, and Hadoop.

Experience

Total Experience: 10 yrs 2 mos

Average Tenure: 2 yrs 10 mos

Current Experience: 1 yr 6 mos

Databricks

2 roles

Staff Backline Engineer - Data & AI

Promoted

Nov 2025 – Present · 7 mos · Hybrid

Senior Backline Engineer

Nov 2024 – Oct 2025 · 11 mos · Hybrid

Conduct in-depth performance analysis of Apache Spark workloads, identifying opportunities to reduce latency, enhance throughput, and improve resource utilization.
Apply advanced tuning techniques to Spark applications, including configuration optimization, caching strategies, and execution-parameter adjustments.
Review and analyze Spark job code to ensure adherence to best practices for performance, scalability, and maintainability; provide recommendations for improvements.
Refactor and optimize Spark code to enhance the efficiency and performance of data processing pipelines.
Leverage thread dump analysis and flame graphs to diagnose performance bottlenecks in slow-running jobs, enabling targeted performance tuning.
Perform heap dump analysis using profilers such as YourKit to identify memory inefficiencies and optimize memory usage across Spark applications.
Improve the stability and performance of Databricks components - including Spark, Delta Lake, and Unity Catalog - by identifying bugs and optimization opportunities, and collaborating with the core Engineering team.
Conduct technical training sessions on Spark internals and best practices to upskill team members.
Participate in the hiring process to recruit the best engineers.

Apache Spark · Performance Tuning · Spark Internals · Data Processing Pipelines · Thread Dump Analysis · Heap Dump Analysis · +2

Mobileum

Technical Lead

Sep 2024 – Nov 2024 · 2 mos · Bengaluru, Karnataka, India · Hybrid

Implemented support for writing streaming data to ClickHouse through Spark Structured Streaming for real-time analytics.
Developed a multi-sink writer and a customized committer for writing streaming data to multiple sinks using Spark Structured Streaming.
Analyzed and identified various ways to improve the performance of writing streaming data to ClickHouse.

ClickHouse · Spark Structured Streaming · Real-time Analytics · Performance Improvement

Amazon Web Services (AWS)

Software Development Engineer II

Dec 2020 – Jun 2024 · 3 yrs 6 mos · Bengaluru, Karnataka, India · Hybrid

Conducted successful Proof of Concepts (PoCs) to support shuffle data handling with Tez on EMR Serverless, resulting in the development of a native shuffle service for Hive on EMR Serverless.
Reduced the cost of running Hive jobs by optimizing container allocation, integrating Hive with EMR Serverless, and implementing features to pre-empt tasks instead of containers.
Achieved a 2x performance improvement in EMR Hive by integrating the S3 Express One Zone service with EMR Hive, optimizing container allocation strategies, and implementing a feature to use HDFS as a scratch directory.
Enhanced security by implementing TLS 1.3 across Hive endpoints, improving data integrity and compliance.
Added support for using AWS Glue Data Catalog as the metastore for Iceberg with HiveCatalog, in addition to the already supported Hive Metastore, ensuring robust and flexible data management.
Resolved critical bugs in EMR Hive, and analyzed numerous bug fixes in open-source Hive and backported them to EMR Hive, improving system resiliency and stability.
Improved debuggability of Hive query failures by separating logs, enhancing error messages, and refactoring code.
Conducted root cause analysis of multiple critical customer issues and delivered timely solutions.
Mentored team members, conducted technical interviews, and served as Scrum Master, ensuring alignment with project milestones and objectives.

Tez · Hive · EMR Serverless · Container Optimization · TLS 1.3 · AWS Glue Data Catalog · +2

Qubole

Member of Technical Staff

Jun 2017 – Nov 2020 · 3 yrs 5 mos · Bengaluru, Karnataka, India

Optimized cloud costs by integrating Lustre FSx as a shuffle store in Qubole’s Hive and Tez applications.
Developed a Hive Metastore Thrift API to streamline temporary table operations, resulting in improved application performance.
Upgraded Hive, Presto, and Pig engines following major version releases, ensuring compatibility and performance enhancements.
Resolved critical customer issues by debugging and contributing fixes to open-source Apache Hive, improving its stability.
Achieved a 4x performance boost in Qubole’s Hive engine by fine-tuning AWS S3 data processing and optimizing configurations.
Contributed to LLAP support in Qubole’s Hive, enhancing query execution and overall user experience.

Lustre FSx · Hive · Tez · Presto · Pig · AWS S3 · +1

Flipkart

Product Solution Engineer II

Sep 2015 – Jun 2017 · 1 yr 9 mos · Bengaluru, Karnataka, India

Improved application resiliency by resolving hundreds of issues through bug fixes, minor enhancements, and thorough documentation of known problems.
Increased productivity and reduced manual workload by automating time-consuming tasks.
Set up cron jobs, dashboards, and alerts to monitor application and infrastructure health, enabling proactive detection of critical issues.
Served as a subject-matter expert (SME) with end-to-end product knowledge, guiding product managers, operations, development, and support teams as needed.
Provided on-call support to assist the support team in diagnosing and resolving critical production issues.
Led the support team and mentored them in effectively handling production issues.

Bug Fixes · Automation · Monitoring · Support · Application Resiliency