Rakesh Raushan

CEO

Bengaluru, Karnataka, India · 6 yrs 7 mos experience

Key Highlights

  • Active contributor to Apache Spark and Iceberg.
  • Expert in building scalable data platforms.
  • Achieved 4x performance gain in data pipelines.
Stackforce AI infers this person is a Backend-heavy Fullstack Engineer specializing in Data Infrastructure and Big Data solutions.

Skills

Core Skills

Apache Spark · Data Infrastructure

Other Skills

Java · Scala · Distributed Systems · Data Analytics · Python · Spring Boot · Apache Spark Streaming · PySpark · Apache Airflow · Dbt · Big Data · Hive · C++ · SQL

About

Software engineer with 6+ years of experience in building core data platforms and optimizing execution engines, with a deep focus on Apache Spark and Table Formats. I enjoy solving complex scalability problems and building tools that automate data engineering at scale. Active Apache Spark contributor and currently focused on Apache Iceberg. Spark PRs: https://github.com/apache/spark/pulls?q=is%3Apr+author%3AiRakson+is%3Aclosed+sort%3Acomments-desc

Experience

Oracle

Principal Member of Technical Staff

Nov 2024 - Present · 1 yr 4 mos · Bengaluru, Karnataka, India · Hybrid

  • Building and scaling a self-serve ETL tool.
  • Designed a YAML-driven self-serve aggregation framework that reduced ad-hoc requests for custom aggregations.
  • Built an AIDP connector for our ETL tool.
  • Designed and implemented a pipeline used to determine the pricing strategy for the Fusion Data Intelligence team. The pipeline ingests various resource usage events and computes cost from them.
  • Refactored existing pipelines resulting in 4x performance gain (12 hrs -> 3 hrs)
  • Orchestrated a zero-data-loss migration strategy for large-scale fact tables during complex schema evolutions.
Apache Spark · Java · Data Infrastructure

Prophecy

Data Engineer

Feb 2024 - Oct 2024 · 8 mos · Bengaluru, Karnataka, India · Hybrid

  • Helped customers migrate their legacy data solutions to modern data solutions.
  • Developed and optimized Prophecy Spark pipelines for customers.
Scala · Apache Spark · Data Infrastructure

Visa

Senior Software Engineer

Sep 2022 - Feb 2024 · 1 yr 5 mos · Bengaluru, Karnataka, India · Hybrid

  • As part of the data platform team, helped build a data fabric solution for users.
  • Implemented a job submission module that submits queries to Spark, Hive, or Presto in sync or async mode.
  • Added a Spark config tuner that generates application configs from past execution statistics and heuristics based on table/partition stats.
  • Helped multiple teams migrate their Spark applications to Spark 3. Migrated 40+ applications.
Scala · Apache Spark · Data Infrastructure

Huawei Technologies India

Software Engineer

Jun 2019 - Aug 2022 · 3 yrs 2 mos · Bangalore

  • At Huawei, I was part of the Spark team for Huawei's cloud distribution.
  • Spark SQL and Catalyst were my major focus areas:
  • Introduced incremental statistics to Spark. Currently, users need to run the expensive ANALYZE TABLE command after data-changing queries to keep statistics up to date. With incremental updates, stats are refreshed automatically after every data-changing command, which improves CBO performance.
  • Dynamic UDF: allows users to update their UDF definitions without restarting the session.
  • Upgraded Spark's built-in Hive version to 3.1 at a time when open source was still using 2.3.
  • Contributed to the Spark open source project: 25+ commits, including two built-in functions, a refactoring of the pagination framework, 10+ bug fixes, and some documentation changes.
Scala · Apache Spark · Data Infrastructure

Education

Indian Institute of Technology, Delhi

Master of Technology - MTech — Computer Science

Jul 2017 - Jun 2019

Amity School of Engineering and Technology, IPU Delhi

Bachelor of Technology - BTech — Computer Science

Jan 2013 - Jan 2017
