Yasar Arafath A

Data Engineer

Chennai, Tamil Nadu, India · 11 yrs 4 mos experience
AI Enabled · AI/ML Practitioner

Key Highlights

  • Architected distributed Spark pipelines processing 2+ TB daily.
  • Led migration from AWS/Snowflake to GCP with zero data loss.
  • Achieved 30-40% reduction in ETL runtime on Databricks.

Skills

Core Skills

Data Engineering · Cloud Migration

Other Skills

PySpark · Python (Programming Language) · Informatica PowerCenter · Autosys · Artificial Intelligence (AI) · Google Cloud Platform (GCP) · Data Build Tool (DBT) · Snowflake · Amazon Web Services (AWS) · Databricks Products · Data Warehousing · Data Modeling · QlikView · Shell Scripting · Google BigQuery

About

Professional Summary

🚀 Scaling Data Excellence through Modern Architecture

As a Principal Data Engineer with 11+ years of experience, I specialize in designing and deploying large-scale distributed data platforms that bridge the gap between raw data and high-impact AI/ML workloads. My expertise spans the entire lifecycle of the Modern Data Stack, from real-time ingestion to sophisticated lakehouse orchestration. I have a proven track record of navigating complex multi-cloud environments (AWS & GCP), ensuring that data infrastructure is not just functional but optimized for performance, cost, and scalability.

🛠 Core Technical Expertise:

  • Data Platforms: Snowflake, Databricks, BigQuery, Redshift
  • Processing & Streaming: Apache Spark (EMR/Dataproc), Kafka, SQL Optimization
  • Transformation & Ops: DBT, Airflow (Cloud Composer), Glue, GCS/S3
  • Next-Gen Tech: Vector databases for AI-ready platforms & large-scale lakehouse design

🏆 Key Career Milestones:

  • Massive Scale: Architected distributed Spark pipelines on AWS EMR processing 2+ TB of daily data, directly enabling mission-critical ML workloads.
  • Cloud Migration: Led a high-stakes migration from AWS/Snowflake to GCP, moving 1+ TB of analytical datasets with zero data loss and full reconciliation.
  • Efficiency & Speed: Achieved a 30-40% reduction in ETL runtime by fine-tuning Spark workloads on Databricks and EMR.
  • Modern ELT: Built a robust DBT architecture with 100+ models, drastically improving data lineage and development velocity.
  • Real-Time Data: Designed Kafka-based event streaming to eliminate ingestion bottlenecks and reduce data latency for downstream consumers.
  • Automation: Orchestrated 50+ production pipelines via Airflow with proactive monitoring and automated alerting.

📫 Let's Connect: I am passionate about building the next generation of data infrastructure. If you're looking to discuss lakehouse architectures, cloud migrations, or AI-ready data platforms, feel free to reach out!
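The DBT milestone above can be illustrated with a minimal sketch of a single staging model. All model and table names here are hypothetical, chosen only for illustration; the lineage benefit comes from DBT's `source()` and `ref()` macros, which record every dependency so a 100+ model project stays traceable end to end.

```sql
-- models/staging/stg_orders.sql (hypothetical model name)
-- A thin staging layer over a raw source table; the source() call
-- below is what dbt records to place this model in the lineage graph.
select
    order_id,
    customer_id,
    cast(order_ts as timestamp) as ordered_at,
    amount_usd
from {{ source('raw', 'orders') }}
where order_id is not null
```

Downstream mart models would then select `from {{ ref('stg_orders') }}` instead of hard-coding table names, letting `dbt docs generate` render the full dependency graph.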

Experience

LTIMindtree

2 roles

Senior Data Specialist

Jul 2025 – Present · 9 mos · Hybrid

PySpark · Python (Programming Language) · Data Engineering · Cloud Migration

Senior Specialist

Jul 2025 – Present · 9 mos · Hybrid

Altimetrik

2 roles

Technical Lead

Aug 2024 – May 2025 · 9 mos · Chennai, Tamil Nadu, India · Hybrid

Senior Software Engineer

Dec 2019 – Apr 2024 · 4 yrs 4 mos · Hybrid

Python (Programming Language) · Informatica PowerCenter · Data Engineering

UST

System Analyst

Oct 2019 – Nov 2019 · 1 mo · Chennai, Tamil Nadu, India · On-site

Cognizant

Associate

Apr 2014 – Oct 2019 · 5 yrs 6 mos · On-site

Informatica PowerCenter · Autosys · Data Engineering

Education

Chettinad College of Engineering & Technology

Bachelor of Engineering — Computer Science

Aug 2009 – Apr 2013
