Vedant Chimmad — Data Engineer

• Oversaw the execution and monitoring of Azure Data Pipelines, ensuring the timely and accurate processing of data.
• Implemented file handling mechanisms to initiate data processing upon file arrival.
• Developed and maintained file name pattern recognition logic to identify and validate incoming data files.
• Created and optimized SQL queries to perform data validation, integrity checks, and transformation processes.
• Enforced data quality standards and conducted thorough data quality checks to identify and rectify anomalies and discrepancies.
• Maintained comprehensive documentation of data pipeline processes, file handling procedures, and rule definitions for future reference.
• Generated and presented regular reports on data processing activities and performance.
• Collaborated effectively with cross-functional teams to address project challenges and ensure seamless data flow.
• Actively participated in troubleshooting and resolution of data-related issues.
• Actively contributed to process optimization and efficiency enhancements within the data pipeline, streamlining workflows and reducing processing times.
• Proficient in translating complex SQL queries into PySpark code, ensuring seamless integration and execution within data pipelines.
• Extensive experience with PySpark’s core functions and APIs, enabling efficient data manipulation and processing within distributed environments.
• Familiarity with AWS Glue’s architecture and its integration with PySpark, leveraging Glue’s capabilities to manage ETL processes effectively.
• Skilled in developing robust PySpark scripts for data transformation and analysis within AWS Glue, ensuring accuracy.
• Adept at identifying and resolving issues in PySpark code, utilizing debugging tools and techniques to ensure smooth data processing workflows.
• Responsible for the continuous maintenance and enhancement of PySpark scripts within AWS Glue, ensuring they remain up-to-date with evolving data requirements and best practices.
• Integrated AWS Glue jobs and Databricks workflow jobs with Apache Airflow.

Stackforce AI infers this person is a Data Engineering specialist with expertise in cloud technologies and data pipeline optimization.

Location: Bengaluru, Karnataka, India

Experience: 4 yrs 6 mos

Skills

Data Engineering
Data Pipelines
ETL Processes

Career Highlights

Expert in building and optimizing data pipelines.
Proficient in SQL and PySpark for data transformation.
Strong collaboration skills with cross-functional teams.

Work Experience

Genpact

Data engineering (1 yr 11 mos)

Tata Consultancy Services

Data Engineer (2 yrs 7 mos)

Q Spider

Test Automation Engineer (7 mos)

Education

Bachelor of Engineering - BE at Visvesvaraya Technological University

Skills

Core Skills

Data Engineering
Data Pipelines
ETL Processes

Other Skills

AWS
AWS Command Line Interface (CLI)
AWS Glue
AWS Identity and Access Management (AWS IAM)
AWS Lambda
Airflow
Amazon CloudWatch
Amazon S3
Automated Alerts
Azure Data Factory
Azure Data Lake
Azure Data Pipelines
Azure Functions
Big Data
Databricks

Experience

Total Experience: 4 yrs 6 mos
Average Tenure: 2 yrs 6 mos
Current Experience: 1 yr 11 mos

Genpact

Data engineering

Jul 2024 – Present · 1 yr 11 mos · Bengaluru, Karnataka, India · Hybrid

Azure Data Pipelines, SQL, PySpark, AWS Glue, Data Quality, Data Engineering, +1 more

Tata Consultancy Services

Data Engineer

Dec 2021 – Jul 2024 · 2 yrs 7 mos · On-site

Worked with AWS and Databricks cloud technologies to build data pipelines from different sources.
Extracted data from SQL Server to AWS S3 using an AWS Glue job.
Built a data pipeline to push data from SharePoint to AWS S3 using Power Automate.
Extracted data from an SFTP server to AWS S3 using AWS Lambda and EventBridge.
Ingested data arriving in AWS S3 as an external Delta table in Databricks.
Implemented transformation logic per business requirements using PySpark.
Technology: Python, AWS Glue, AWS Lambda, AWS RDS, AWS S3, PySpark, SQL, Airflow, Databricks, Pytest