SUMIT VANIYA

Data Engineer

Gandhinagar, Gujarat, India

2 yrs 3 mos experience

Most Likely To Switch

Key Highlights

Designed scalable ETL pipelines for real-time analytics.
Awarded Best Data Engineer Employee for outstanding performance.
Expert in cloud-native data solutions and Kubernetes infrastructure.

Stackforce AI infers this person is a Data Engineering expert specializing in cloud-native solutions and big data processing.

Contact

sumit.vaniya@merillife.com
LinkedIn

Skills

Core Skills

Data Engineering
Big Data Processing

Other Skills

Amazon Web Services (AWS)
Apache Kafka
Application Deployment
ArgoCD
C++
Cascading Style Sheets (CSS)
Cassandra
ClickHouse
Communication
Dagster
Data Lakes
Docker
Express.js
Extract, Transform, Load (ETL)
Front-End Development

About

Results-driven Data Engineer with hands-on expertise in designing and scaling enterprise-grade ETL pipelines, cloud-native data lakes, and production-ready Kubernetes infrastructure. Skilled in building high-performance pipelines using Apache Kafka, Spark, PySpark, and Hudi, enabling seamless data ingestion, transformation, and real-time analytics.

Proven experience in architecting end-to-end data platforms (Kafka → PySpark → Hudi → MinIO → Trino) with ACID compliance and automated quality checks, serving 100+ users with sub-second query response times. Experienced in Kubernetes, Docker, Helm, and ArgoCD, delivering zero-downtime migrations, GitOps-driven deployments, and multi-database integration (PostgreSQL, Cassandra, Neo4j, ClickHouse).

Adept at leading cross-functional teams, mentoring engineers, and driving measurable improvements in productivity and system reliability. Recognized with the Best Data Engineer Employee Award and certifications in modern data orchestration.

Key strengths include Data Engineering, Big Data Processing, Cloud Platforms (AWS), Workflow Orchestration (Dagster, Airflow, Flyte), and DevOps automation. Passionate about building scalable, resilient, and future-ready data systems that accelerate business insights and innovation.

Let’s connect if you’re working on data-intensive systems, distributed infrastructure, or just want to chat about modern orchestration platforms!