Kalpesh Shimpi

Data Engineer

Bengaluru, Karnataka, India · 8 yrs 10 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Expert in data engineering solutions using Apache Spark.
  • Proven track record in optimizing ETL workflows.
  • Strong background in Hadoop administration and support.
Stackforce AI infers this person is a Data Engineering specialist with extensive experience in Hadoop and Spark technologies.

Skills

Core Skills

Data Engineering · Apache Spark · Hadoop Administration · Linux System Administration

Other Skills

AWS · Airflow · Amazon Web Services (AWS) · Ambari · Apache Kafka · Apache NiFi · Apache server · Azure · Big Data · CSV · Citrix Xen · Delta · Delta Live Tables · GCP · HBase

Experience

Total Experience: 8 yrs 10 mos
Average Tenure: 2 yrs 2 mos
Current Experience: 3 yrs 6 mos

Databricks

Data Engineer

Dec 2022 - Present · 3 yrs 6 mos · Bengaluru, Karnataka, India · Remote

  • Delivered expert consultation to Databricks clients, offering guidance on data engineering solutions using Spark SQL, Delta, Delta Live Tables, and Unity Catalog.
  • Scheduled PySpark ETL workflows on Databricks using Airflow.
  • Experienced in SQL query optimization and performance tuning.
  • Integrated and processed billing data from multiple sources, including AWS, Azure, and GCP, handling file formats such as JSON, CSV, and Parquet.
  • Implemented Spark optimization techniques such as caching, multithreading, and broadcast joins, resulting in a 20% decrease in processing time for a daily load of around 2 million records.
  • Expert in performance tuning of Spark applications on the Databricks platform, optimizing data processing and improving overall system performance.
  • Identified root causes of problematic Spark applications and delivered effective solutions to mitigate issues, minimizing downtime and enhancing the reliability of data pipelines.
  • Deployed Databricks workspaces and notebook-based workflows on SQL Warehouse, job compute, and serverless compute using Terraform.
Spark SQL · Delta · Delta Live Tables · Unity Catalog · PySpark · Airflow · +15

Amazon Web Services (AWS)

Cloud Big Data Engineer

Aug 2021 - Dec 2022 · 1 yr 4 mos · Bangalore Urban, Karnataka, India

Cloudera

Hadoop Admin

Oct 2018 - Aug 2021 · 2 yrs 10 mos · Bangalore Urban, Karnataka, India

  • Provided remote support to customers across the globe using Hortonworks Data Platform (HDP) and Cloudera Data Platform.
  • Supported HDP/Apache Hadoop components such as HDFS, YARN, MapReduce, Spark, Tez, Oozie, Hive, HBase, Ambari, Ranger, Knox, Ranger KMS, NiFi, Kafka, ZooKeeper, and Kerberos.
  • Took live calls during production-down situations, understanding the customer's environment, troubleshooting the problem, and helping the customer get the business running again.
  • Collected and analyzed SmartSense bundles and log files from the components listed above.
  • Troubleshot and resolved Priority 1 issues over Webex.
  • Provided root cause analysis after each issue was resolved.
  • Reproduced complex customer issues in an in-house OpenStack lab environment and provided workarounds and fixes accordingly.
Hadoop · HDFS · YARN · MapReduce · Spark · Tez · +11

ESDS Software Solution Pvt. Ltd.

Linux System Administrator

Jul 2017 - Sep 2018 · 1 yr 2 mos · Nashik

  • As a member of Linux tech support, my daily routine consists of handling customer issues ranging from website problems and server outages to routine maintenance tasks, along with working on cloud and virtualization technologies. On a regular day at the office I'm responsible for the following jobs:
  • Connecting with clients through tickets and chats and resolving their issues on a timely basis.
  • Create and deploy virtual machines in our data centres in Mumbai and Nashik using various virtualization platforms (VMware ESXi, Hyper-V, and Citrix Xen).
  • Configure network settings in order to access servers.
  • Create and manage disk partitions (LVM) and file systems.
  • Install software (MySQL, Apache server, Java, etc.) to meet client requirements.
  • Install virtual firewalls in order to filter network traffic.
  • Basic Linux troubleshooting and automation.
  • Configure server monitoring and define various alerts (ping offline, HTTP, CPU and HDD usage above the threshold, etc.).
  • Configure backup agents using R1Soft and Veeam backup management software.
Linux · VMware ESXi · Hyper-V · Citrix Xen · MySQL · Apache server · +4

Education

PSGVPM's D. N. Patel College of Engineering, Shahada, Dist. Nandurbar

D. N. Patel College of Engineering · Computer Engineering
