Pratibha Malviya

Data Engineer

Bhopal, Madhya Pradesh, India · 4 yrs 5 mos experience

Most Likely To Switch

Key Highlights

Developed and optimized over 60 ETL data pipelines.
Reduced cluster costs by 15% through effective resource management.
Adept at mentoring team members and adapting to project needs.

Stackforce AI infers this person is a Data Engineering expert with a strong focus on ETL processes and cloud technologies.

Skills

Core Skills

PySpark · SQL · Splunk · Python

Other Skills

Agile Methodologies · Amazon EC2 · Amazon Redshift · Amazon S3 · Amazon Web Services (AWS) · Appian · Automation · Automation Studio · Azure Databricks · Cascading Style Sheets (CSS) · Continuous Integration and Continuous Delivery (CI/CD) · Core Java · Data Analysis · Data Modeling · Databricks

About

Results-driven Data Engineer with 3.5+ years of experience in designing scalable data solutions. Proficient in Python, PySpark, SQL, Databricks, Unity Catalog, Splunk, and CI/CD. Skilled in building data pipelines, optimizing workflows, and managing complex datasets for actionable insights and operational efficiency. Adept at mentoring team members, adapting to changing project needs, and taking on additional responsibilities to drive business success. Always open to learning new technologies and to opportunities that will help me gain more experience.

Experience

In Time Tec

Data Engineer

Jul 2023 – Present · 2 yrs 8 mos · Jaipur, Rajasthan, India · On-site

Developed and optimized 60+ ETL data pipelines using PySpark, SQL, and optimization techniques.
Maintained 100+ data pipelines in Databricks for efficient data processing.
Reduced cluster costs by 15% through effective use of autoscaling and selection of the right EC2 instance node type.
Created and scheduled various ETL workflows on Databricks.
Refactored 80+ jobs using clean code principles to improve readability and performance.
Designed and implemented test cases for various data pipelines using PySpark and Spark SQL to validate data quality.
Created Splunk dashboards to monitor data count trends, track job run statuses, and generate comprehensive reports.
Created multiple external tables on Unity Catalog to manage data pipeline data stored in S3.

Python (Programming Language) · PySpark · SQL · Azure Databricks · Splunk · GitHub · +9

Capgemini

Software Engineer

Oct 2021 – Jul 2023 · 1 yr 9 mos · Hyderabad, Telangana, India

Built scalable, maintainable Python, PySpark and SQL code, ensuring software stability.
Investigated and resolved bugs in data pipelines using PySpark transformations.
Optimized 30+ ETL pipelines, reducing processing time by 20% and improving system performance.
Designed Databricks workflows to manage data pipeline dependencies and schedule executions efficiently.
Utilized AWS S3 to store billions of records processed through ETL pipelines.
Applied optimization techniques to minimize processing time and reduce costs.

Python (Programming Language) · SQL · PySpark · Databricks · ETL/ELT · Appian · +4