Rohitash Jain

Data Engineer

Abu Dhabi, United Arab Emirates · 9 yrs 5 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Over 5 years of experience in data engineering.
Expert in optimizing ETL processes and data pipelines.
Proficient in Apache Spark and data warehousing solutions.

Stackforce AI infers this person is a Data Engineer specializing in SaaS and analytics solutions.

Skills

Core Skills

Data Engineering, Apache Spark

Other Skills

Algorithms, Analytics platform development, C, C++, Cost monitoring, Data Structures, Data sharing, ETL optimization, ETL pipelines, Java, JavaScript, Linux, Programming, Query optimization, SQL

About

Data Engineer with more than 5 years of experience in data products and data warehousing. I have a proclivity for massive datasets and complex business problems. I have worked with a couple of US-based startups, helping them solve their data-processing pain points and extract valuable insights from their data.

Experience

Total Experience: 9 yrs 5 mos

Average Tenure: 3 yrs 1 mo

Current Experience: 6 yrs 2 mos

Alef Education

2 roles

Senior Data Engineer

Promoted

Jun 2021 – Present · 4 yrs 11 mos · Abu Dhabi, United Arab Emirates

Data Engineer

Feb 2020 – May 2021 · 1 yr 3 mos · Abu Dhabi, United Arab Emirates

Qubole

Member Of Technical Staff II

Apr 2018 – Feb 2020 · 1 yr 10 mos · Bengaluru Area, India

Started my Qubole journey by optimizing old ETLs to get up to speed with the system.
Identified and reduced the cost of S3 listing calls.
Migrated Hive jobs to Apache Spark and tuned them for high resource utilization to avoid resource wastage.
Contributed to a cost explorer for clients, enabling them to monitor and track cost at the query, user, and cluster level.
Productionised Bifrost, Qubole’s in-house platform for sharing data with customers over S3, which shares insightful data with 1000s of customer accounts daily.

ETL optimization, Apache Spark, Cost monitoring, Data sharing, Data Engineering

Sigmoid

Software Developer

Nov 2016 – Apr 2018 · 1 yr 5 mos · Bengaluru Area, India

Built a powerful analytics platform for the media industry.
Key features include query responses in seconds on hundreds of terabytes of data, real-time alerting and reporting, a cloud-enabled environment, SQL support with optimized count-distinct queries, ad-hoc analytics, and REST and Thrift APIs.
Set up ETL pipelines at a scale of 100 GB per hour for various clients.
Worked on query optimization on hundreds of terabytes of data.
Worked on the client side to orchestrate multiple ETL pipelines using Airflow.
Customized the platform to accommodate client requirements such as data replacement and collating data from multiple sources.
Have good knowledge of Apache Spark internals, have worked extensively on Spark job optimization, and can identify the root cause of a slow Spark job.

Analytics platform development, ETL pipelines, Query optimization, Apache Spark, Data Engineering

Morgan Stanley

Intern

Jan 2016 – Jun 2016 · 5 mos · Mumbai Area, India

Worked in the Risk Aggregation team of Morgan Stanley as a spring intern.
Worked on an internal web dashboard and contributed two end-to-end applications.
Responsible for creating a view (UI), a service (REST), and an automation script (Perl).