Piyush Lathiya

Product Engineer

Bengaluru, Karnataka, India · 9 yrs 7 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • 9 years of experience in Big Data and Machine Learning.
  • Expert in developing real-time data pipelines and optimizing ETL processes.
  • Strong background in Data Mesh and cloud migration strategies.
Stackforce AI infers this person is a Data Engineering expert in SaaS with strong cloud migration and big data capabilities.

Skills

Core Skills

Big Data · Data Engineering · Cloud Migration

Other Skills

AWS · AWS EMR · AWS Glue · Agile Methodologies · Airflow · Amazon Web Services (AWS) · Apache Airflow · Apache Beam · Apache Flink · Apache Hudi · Azure Cosmos DB · Azure Data Factory · Azure Data Lake · Azure Databricks · C

About

  • 9 years of total experience implementing analytics, IoT, and machine learning products and solutions on Big Data platforms, including enterprise-scale data lake systems as well as startups.
  • Good exposure to Spark, Spark Streaming, Kafka, ZooKeeper, and Redis with Python, PySpark, and Java.
  • Experience using various Hadoop distributions (Hortonworks, Cloudera) as well as cloud services (AWS, Azure, GCP) to fully implement large-scale solutions.
  • Worked extensively on performance optimization and automation.
  • Conducted research in machine learning using clustering and other algorithms, and published a research paper at an IEEE conference.
  • Sound understanding of modern data architectures such as Data Lakehouse and Data Mesh; worked on a Data Mesh implementation.
  • Worked on implementing and testing APIs to retrieve engagement metrics from a database.
  • Knowledge of data metadata/catalog concepts; implemented a DataHub setup for data cataloging.
  • Ability to find more efficient and reusable solutions.
  • Over time, gained knowledge of business processes and agile methodologies, and expertise in problem solving using different open-source tools and technologies.

Experience

Total Experience: 9 yrs 7 mos
Average Tenure: 1 yr 11 mos
Current Experience: 2 yrs 8 mos

Asper.ai

Lead Data Engineer

Oct 2023 - Present · 2 yrs 8 mos · Bengaluru, Karnataka, India · Hybrid

BlueJeans by Verizon

Lead Data Engineer

Aug 2021 - Oct 2023 · 2 yrs 2 mos · Bengaluru, Karnataka, India · Hybrid

  • Developed real-time data pipelines to process meetings, events, and other data using PySpark, Kafka, and EMR; stored the data in a data lake (around 300 TB) and loaded it into a Redshift warehouse for different use cases.
  • Produced HLD/LLD documents for different projects, architecture designs for new features, and POCs.
  • Worked with data scientists to generate text summaries from meetings.
  • Generated denormalized tables in Athena to retrieve Engagement Score metrics for users, and developed an API to display this information on a web page for a given date range.
  • Optimized many ETL pipelines for performance and scalability. Migrated many ETL jobs from EC2 to serverless AWS Glue, monitored and triggered by Apache Airflow, which reduced overall cost by 40-45%.
  • Guided, led, and trained junior engineers.
  • Worked on implementing Data Mesh key pillars by designing data catalog/metadata solutions.
  • Technologies used: PySpark, Python, SparkSQL, Kafka, EMR, Airflow, Redshift, Glue, Lambda, AWS API Gateway, Athena, DynamoDB, CodeCommit, Kubernetes, Docker, DataHub, MSK, EKS
PySpark · Kafka · EMR · Datalake · Redshift · AWS Glue · +5

BookMyShow

2 roles

Team Lead

Promoted

Mar 2021 - Aug 2021 · 5 mos

  • Hudi Data Platform: Developed a central data platform catering to all the needs of the data department and maintained a central data repository using Scala/Spark and Apache Hudi.
  • Hands-on contribution to the design and development of data platform services such as CDC incremental load, streaming load, schema resolver, and dynamic pipeline creation.
  • Hands-on experience in Cloudera cluster deployment and Docker image deployment.
  • Led the cloud migration from the on-prem legacy data center (Cloudera cluster, databases, etc.) to AWS using services such as AWS EMR, S3, EC2, Glue, Redshift, and Airflow.
Scala · Spark · Apache Hudi · AWS · Airflow · Data Engineering · +1

Data Engineer 2

Jan 2020 - Mar 2021 · 1 yr 2 mos

A.P. Moller - Maersk

Data Engineer

Feb 2018 - Jan 2020 · 1 yr 11 mos · Bengaluru, Karnataka, India

Sensight Technologies Private Limited

Data Engineer

Nov 2016 - Feb 2018 · 1 yr 3 mos · Bengaluru Area, India

Education

Thapar Institute of Engineering & Technology

Master of Engineering - MEng — Computer Software Engineering

Jan 2014 - Jan 2016

SVIT, Vasad Official

Bachelor’s Degree — Computer Engineering

Jan 2009 - Jan 2013
