Shashank Puli — Data Engineer

I am a Data Engineer with 4+ years of experience building and maintaining data pipelines across cloud and on-premise platforms. I design ETL workflows to process large datasets and deliver actionable insights for business decisions.

Skills: Python | SQL | Java | Shell Scripting | AWS | GCP | Azure | Spark | PySpark | Airflow | Snowflake | HBase | Hadoop | Kafka | ETL Pipelines | Power BI | Tableau | Data Integration | Real-Time Workflows

At Crowe LLP, I built ETL pipelines on AWS using Lambda, S3, and Python to process financial datasets. I orchestrated real-time workflows with Kubernetes and Docker, cutting processing latency by 30%. I managed IAM roles and CloudWatch triggers to automate scripts, optimized HBase and Redshift storage, and ran Apache Spark jobs to speed up batch processing by 40%.

At Orange S.A., I developed ETL pipelines in BigQuery and Dataflow for telecom and customer data. I created Airflow DAGs to manage workflows and used Cloud Monitoring to reduce data discrepancies by 40%. I designed and optimized Snowflake warehouses, improved SQL queries, and automated GCP resource provisioning with Terraform and Python, cutting deployment time in half.

At Carborundum Universal, I improved ETL pipelines to process manufacturing and supply chain data using Python, SQL, and Airflow. I built real-time Power BI dashboards to monitor production KPIs, cutting reporting time by 40%. I integrated workflows with Spark and Hadoop, migrated legacy data to Snowflake for better access and query speed, and implemented data validation to ensure accuracy.

Contact Information:
Email: pulishashank1@gmail.com
Mobile: +1 (484) 487-1200
Location: United States

Stackforce AI infers this person is a Data Engineer with expertise in Fintech and Telecom data solutions.

Experience: 5 yrs 6 mos

Skills

ETL Pipeline Development
Cloud Computing
Data Visualization
Full-stack Development

Career Highlights

Expert in building scalable ETL pipelines across cloud platforms.
Proven track record of optimizing data workflows and reducing latency.
Strong background in data visualization and real-time analytics.

Work Experience

Crowe MacKay LLP

GP InfoTech

Carborundum Universal Limited

Nalluri Associates

Education

Master of Science at University of South Florida

Bachelor of Technology in Computer Science at GITAM Deemed University

AI Enabled | AI ML Practitioner

Contact

pulishashank1@gmail.com

Skills

Core Skills

ETL Pipeline Development | Cloud Computing | Data Visualization | Full-stack Development

Other Skills

AWS | AWS Lambda | Airflow | Amazon Redshift | Amazon S3 | Amazon S3 & Lake Formation | Amazon Web Services (AWS) | Analytical Skills | Anomaly Detection | Apache Airflow | Apache Spark | Artificial Intelligence (AI) | Augmented Reality (AR) | Batch Processing | Big Data Processing

About

Experience

Total Experience: 5 yrs 6 mos
Average Tenure: 1 yr 4 mos

Current Experience

Crowe MacKay LLP

Data Analytics Engineer | AWS | PySpark | ETL | Spark | Kubernetes | Docker | Big Data | HBase

Aug 2024 – Sep 2025 · 1 yr 1 mo · Tampa, Florida, United States · Remote

1. Architected and deployed scalable ETL pipelines on AWS using Lambda, S3, and Python, accelerating ingestion and transformation of large financial datasets across multiple client projects.
2. Orchestrated real-time data integration workflows with Kubernetes and Docker, reducing processing latency by 30% while improving efficiency and reliability of client reporting streams.
3. Configured IAM roles and CloudWatch triggers to automate Python scripts, strengthening system security, enhancing operational uptime, and ensuring compliance with internal policies.
4. Optimized HBase and Redshift storage structures through partitioning and indexing, enhancing query performance and reducing data retrieval runtime for financial reporting.
5. Executed Apache Spark jobs on AWS EMR for large-scale batch processing, boosting processing speed by 40% and enabling faster turnaround of financial analytics.
6. Implemented pipeline monitoring and error handling using CloudWatch and Python, identifying failures and maintaining high reliability across all data workflows.
7. Collaborated with cross-functional teams to design efficient data models and streamline database schemas, improving analytics performance and operational insights for clients.
8. Integrated multiple client data sources into unified ETL pipelines, ensuring seamless data consolidation, analytics, and reporting for diverse financial datasets.
9. Validated and audited large-scale datasets using Python and SQL, maintaining high standards of accuracy, consistency, and compliance across client deliverables.
10. Documented pipeline architecture, workflows, and operational procedures, enabling knowledge transfer, team onboarding, and ongoing optimization of ETL processes.

AWS | Python | ETL | Kubernetes | Docker | HBase

GP InfoTech

Data Engineer | BigQuery | Dataflow | Airflow | Snowflake | Python | SQL

Mar 2021 – Jul 2023 · 2 yrs 4 mos · Hyderabad, Telangana, India · Hybrid

1. Architected and deployed highly scalable ETL pipelines in BigQuery and Dataflow, processing and transforming large volumes of telecom and customer datasets for advanced analytics and reporting.
2. Orchestrated Airflow DAGs to automate complex GCP workflows, ensuring seamless integration with Cloud Storage and Pub/Sub while maintaining consistent pipeline reliability.
3. Executed comprehensive data quality checks using Cloud Monitoring and Stackdriver, identifying and resolving discrepancies to reduce errors in customer analytics reports by 40%.
4. Designed, optimized, and maintained Snowflake data warehouses, enhancing SQL query efficiency, implementing clustering strategies, and accelerating analytics for customer behavior and network performance.
5. Automated GCP resource provisioning using Terraform and Python, reducing infrastructure deployment time by 50% while enforcing compliance with internal security and operational standards.
6. Integrated multiple telecom and customer datasets into unified pipelines, enabling consolidated analytics, improved reporting, and actionable insights for network performance and customer trends.
7. Implemented rigorous data validation and anomaly detection processes, ensuring high accuracy, consistency, and reliability across all ingested and transformed datasets.
8. Collaborated with data scientists and analysts to develop analytics-ready data models, enhancing insights into customer behavior patterns and network performance metrics.
9. Monitored pipeline performance and resolved errors using Cloud Monitoring, maintaining high system uptime, operational stability, and workflow efficiency across all projects.
10. Documented end-to-end data workflows, pipeline architecture, and operational procedures, supporting team knowledge transfer, onboarding, and continuous process improvement.

BigQuery | Dataflow | Airflow | Snowflake | Python | SQL

Carborundum Universal Limited

Data Analyst | Spark Developer | ETL | Airflow | Snowflake | Power BI | SQL | Python

Mar 2020 – Feb 2021 · 11 mos · Greater Chennai Area · Remote

1. Engineered and optimized ETL pipelines to ingest, clean, and transform large-scale manufacturing and supply chain data using Python, SQL, and Airflow, improving processing efficiency.
2. Developed and maintained real-time dashboards in Power BI to monitor production KPIs, reducing reporting latency by 40% and accelerating data-driven management decisions.
3. Implemented data integration workflows using Spark and Hadoop, consolidating operational data across multiple facilities and enabling seamless analytics and reporting.
4. Migrated legacy datasets to Snowflake, enhancing data accessibility, optimizing query performance, and accelerating insights for analytics and operations teams.
5. Executed data validation and anomaly detection processes using Python scripts and SQL, ensuring high accuracy, consistency, and reliability of production and inventory data.
6. Automated recurring ETL tasks and data workflows, reducing manual intervention, improving consistency, and increasing overall supply chain data processing efficiency.
7. Collaborated with cross-functional teams to design and implement optimized data models, supporting advanced analytics initiatives and operational reporting requirements.
8. Monitored ETL pipeline performance, identified bottlenecks, and resolved issues, maintaining uninterrupted data flow and high operational reliability.
9. Documented end-to-end ETL processes, dashboards, and workflows, facilitating team knowledge transfer, onboarding, and continuous improvement of data operations.
10. Integrated additional manufacturing and supply chain data sources into existing pipelines, expanding analytics coverage and delivering actionable insights to improve operations.

Python | SQL | Airflow | Power BI | Snowflake | ETL Pipeline Development

Nalluri Associates

Full-Stack Developer | Python | JavaScript | HTML | CSS | React | Node.js | MySQL | Flask

Jan 2019 – Mar 2020 · 1 yr 2 mos · Hyderabad, Telangana, India · Remote

1. Developed and deployed a fully functional company website, handling both frontend (HTML, CSS, JavaScript) and backend (Python, MySQL, Flask) to enhance the firm’s online presence and customer engagement.
2. Designed and implemented lightweight internal tools using Excel automation and simple database systems to streamline client and loan data management.
3. Collaborated directly with leadership to understand business needs and deliver custom technology solutions that optimized daily financial operations.
4. Improved workflow efficiency by digitizing manual loan tracking processes, reducing administrative effort and improving data accuracy.
5. Gained hands-on experience in end-to-end full-stack development, database integration, and business-oriented software solutions in a real-world financial services environment.

Python | JavaScript | HTML | CSS | MySQL | Full-Stack Development