Rohitash Jain

Data Engineer

Abu Dhabi, United Arab Emirates · 9 yrs 3 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Over 5 years of experience in data engineering.
  • Expert in optimizing ETL processes and data pipelines.
  • Proficient in Apache Spark and data warehousing solutions.
Stackforce AI infers this person is a Data Engineer specializing in SaaS and analytics solutions.

Skills

Core Skills

Data Engineering, Apache Spark

Other Skills

Algorithms, Analytics platform development, C, C++, Cost monitoring, Data Structures, Data sharing, ETL optimization, ETL pipelines, Java, JavaScript, Linux, Programming, Query optimization, SQL

About

Data Engineer with more than 5 years of experience in data products and data warehousing. I have a proclivity for massive datasets and complex business problems. I have worked with a couple of US-based startups, helping them solve their data-processing pain points and extract valuable insights from their data.

Experience

Alef Education

2 roles

Senior Data Engineer

Promoted

Jun 2021 - Present · 4 yrs 9 mos · Abu Dhabi, United Arab Emirates

Data Engineer

Feb 2020 - May 2021 · 1 yr 3 mos · Abu Dhabi, United Arab Emirates

Qubole

Member Of Technical Staff II

Apr 2018 - Feb 2020 · 1 yr 10 mos · Bengaluru Area, India

  • Started at Qubole by optimizing legacy ETLs, which helped me get up to speed with the system.
  • Identified and reduced the cost of S3 listing calls.
  • Migrated Hive jobs to Apache Spark and tuned them for high resource utilization to avoid resource wastage.
  • Contributed to the cost explorer for clients, enabling them to monitor and track cost at the query, user, and cluster level.
  • Productionised Bifrost, Qubole’s in-house platform for sharing data with customers over S3, which shares insightful data with thousands of customer accounts daily.
ETL optimization, Apache Spark, Cost monitoring, Data sharing, Data Engineering

Sigmoid

Software Developer

Nov 2016 - Apr 2018 · 1 yr 5 mos · Bengaluru Area, India

  • Built a powerful analytics platform for the media industry.
  • Key features include query responses in seconds on hundreds of terabytes of data, real-time alerting and reporting, a cloud-enabled environment, SQL support with optimized count-distinct queries, ad-hoc analytics, and REST and Thrift APIs.
  • Set up ETL pipelines at a scale of 100 GB per hour for various clients.
  • Worked on query optimization over hundreds of terabytes of data.
  • Orchestrated multiple ETL pipelines on the client side using Airflow.
  • Customized the platform to accommodate client requirements such as data replacement and collating data from multiple sources.
  • Have good knowledge of Apache Spark internals, have worked extensively on Spark job optimization, and can identify the root cause of a slow Spark job.
Analytics platform development, ETL pipelines, Query optimization, Apache Spark, Data Engineering

Morgan Stanley

Intern

Jan 2016 - Jun 2016 · 5 mos · Mumbai Area, India

  • Worked in the Risk Aggregation team of Morgan Stanley as a spring intern.
  • Worked on an internal web dashboard and contributed two end-to-end applications.
  • Responsible for creating the view (UI), the service (REST), and an automation script (Perl).

Education

Indian Institute Of Information Technology Allahabad

Bachelor of Technology - BTech — Electronics and Communications Engineering

Jan 2012 - Jan 2016
