Shashank Puli

Data Engineer

5 yrs 6 mos experience
AI Enabled | AI ML Practitioner

Key Highlights

  • Expert in building scalable ETL pipelines across cloud platforms.
  • Proven track record of optimizing data workflows and reducing latency.
  • Strong background in data visualization and real-time analytics.
Stackforce AI infers this person is a Data Engineer with expertise in Fintech and Telecom data solutions.

Skills

Core Skills

ETL Pipeline Development | Cloud Computing | Data Visualization | Full-stack Development

Other Skills

AWS | AWS Lambda | Airflow | Amazon Redshift | Amazon S3 | Amazon S3 & Lake Formation | Amazon Web Services (AWS) | Analytical Skills | Anomaly Detection | Apache Airflow | Apache Spark | Artificial Intelligence (AI) | Augmented Reality (AR) | Batch Processing | Big Data Processing

About

I am a Data Engineer with 4+ years of experience building and maintaining data pipelines across cloud and on-premise platforms. I design ETL workflows to process large datasets and deliver actionable insights for business decisions.

Skills: Python | SQL | Java | Shell Scripting | AWS | GCP | Azure | Spark | PySpark | Airflow | Snowflake | HBase | Hadoop | Kafka | ETL Pipelines | Power BI | Tableau | Data Integration | Real-Time Workflows

At Crowe LLP, I built ETL pipelines on AWS using Lambda, S3, and Python to process financial datasets. I orchestrated real-time workflows with Kubernetes and Docker, cutting processing latency by 30%. I managed IAM roles and CloudWatch triggers to automate scripts, optimized HBase and Redshift storage, and ran Apache Spark jobs to speed up batch processing by 40%.

At Orange S.A., I developed ETL pipelines in BigQuery and Dataflow for telecom and customer data. I created Airflow DAGs to manage workflows and used Cloud Monitoring to reduce data discrepancies by 40%. I designed and optimized Snowflake warehouses, improved SQL queries, and automated GCP resource provisioning with Terraform and Python, cutting deployment time in half.

At Carborundum Universal, I improved ETL pipelines to process manufacturing and supply chain data using Python, SQL, and Airflow. I built real-time Power BI dashboards to monitor production KPIs, cutting reporting time by 40%. I integrated workflows with Spark and Hadoop, migrated legacy data to Snowflake for better access and query speed, and implemented data validation to ensure accuracy.

Contact Information:
Email: pulishashank1@gmail.com
Mobile: +1 (484) 487-1200
Location: United States

Experience

Total Experience: 5 yrs 6 mos
Average Tenure: 1 yr 4 mos
Current Experience: --

Crowe MacKay LLP

Data Analytics Engineer | AWS | PySpark | ETL | Spark | Kubernetes | Docker | Big Data | HBase

Aug 2024 - Sep 2025 · 1 yr 1 mo · Tampa, Florida, United States · Remote

  • 1. Architected and deployed scalable ETL pipelines on AWS using Lambda, S3, and Python, accelerating ingestion and transformation of large financial datasets across multiple client projects.
  • 2. Orchestrated real-time data integration workflows with Kubernetes and Docker, reducing processing latency by 30% while improving efficiency and reliability of client reporting streams.
  • 3. Configured IAM roles and CloudWatch triggers to automate Python scripts, strengthening system security, enhancing operational uptime, and ensuring compliance with internal policies.
  • 4. Optimized HBase and Redshift storage structures through partitioning and indexing, enhancing query performance and reducing data retrieval runtime for financial reporting.
  • 5. Executed Apache Spark jobs on AWS EMR for large-scale batch processing, boosting processing speed by 40% and enabling faster turnaround of financial analytics.
  • 6. Implemented pipeline monitoring and error handling using CloudWatch and Python, identifying failures and maintaining high reliability across all data workflows.
  • 7. Collaborated with cross-functional teams to design efficient data models and streamline database schemas, improving analytics performance and operational insights for clients.
  • 8. Integrated multiple client data sources into unified ETL pipelines, ensuring seamless data consolidation, analytics, and reporting for diverse financial datasets.
  • 9. Validated and audited large-scale datasets using Python and SQL, maintaining high standards of accuracy, consistency, and compliance across client deliverables.
  • 10. Documented pipeline architecture, workflows, and operational procedures, enabling knowledge transfer, team onboarding, and ongoing optimization of ETL processes.
AWS | Python | ETL | Kubernetes | Docker | HBase (+4 more)

GP Infotech

Data Engineer | BigQuery | Dataflow | Airflow | Snowflake | Python | SQL

Mar 2021 - Jul 2023 · 2 yrs 4 mos · Hyderabad, Telangana, India · Hybrid

  • 1. Architected and deployed highly scalable ETL pipelines in BigQuery and Dataflow, processing and transforming large volumes of telecom and customer datasets for advanced analytics and reporting.
  • 2. Orchestrated Airflow DAGs to automate complex GCP workflows, ensuring seamless integration with Cloud Storage and Pub/Sub while maintaining consistent pipeline reliability.
  • 3. Executed comprehensive data quality checks using Cloud Monitoring and Stackdriver, identifying and resolving discrepancies to reduce errors in customer analytics reports by 40%.
  • 4. Designed, optimized, and maintained Snowflake data warehouses, enhancing SQL query efficiency, implementing clustering strategies, and accelerating analytics for customer behavior and network performance.
  • 5. Automated GCP resource provisioning using Terraform and Python, reducing infrastructure deployment time by 50% while enforcing compliance with internal security and operational standards.
  • 6. Integrated multiple telecom and customer datasets into unified pipelines, enabling consolidated analytics, improved reporting, and actionable insights for network performance and customer trends.
  • 7. Implemented rigorous data validation and anomaly detection processes, ensuring high accuracy, consistency, and reliability across all ingested and transformed datasets.
  • 8. Collaborated with data scientists and analysts to develop analytics-ready data models, enhancing insights into customer behavior patterns and network performance metrics.
  • 9. Monitored pipeline performance and resolved errors using Cloud Monitoring, maintaining high system uptime, operational stability, and workflow efficiency across all projects.
  • 10. Documented end-to-end data workflows, pipeline architecture, and operational procedures, supporting team knowledge transfer, onboarding, and continuous process improvement.
BigQuery | Dataflow | Airflow | Snowflake | Python | SQL (+2 more)

Carborundum Universal Limited

Data Analyst | Spark Developer | ETL | Airflow | Snowflake | Power BI | SQL | Python

Mar 2020 - Feb 2021 · 11 mos · Greater Chennai Area · Remote

  • 1. Engineered and optimized ETL pipelines to ingest, clean, and transform large-scale manufacturing and supply chain data using Python, SQL, and Airflow, improving processing efficiency.
  • 2. Developed and maintained real-time dashboards in Power BI to monitor production KPIs, reducing reporting latency by 40% and accelerating data-driven management decisions.
  • 3. Implemented data integration workflows using Spark and Hadoop, consolidating operational data across multiple facilities and enabling seamless analytics and reporting.
  • 4. Migrated legacy datasets to Snowflake, enhancing data accessibility, optimizing query performance, and accelerating insights for analytics and operations teams.
  • 5. Executed data validation and anomaly detection processes using Python scripts and SQL, ensuring high accuracy, consistency, and reliability of production and inventory data.
  • 6. Automated recurring ETL tasks and data workflows, reducing manual intervention, improving consistency, and increasing overall supply chain data processing efficiency.
  • 7. Collaborated with cross-functional teams to design and implement optimized data models, supporting advanced analytics initiatives and operational reporting requirements.
  • 8. Monitored ETL pipeline performance, identified bottlenecks, and resolved issues, maintaining uninterrupted data flow and high operational reliability.
  • 9. Documented end-to-end ETL processes, dashboards, and workflows, facilitating team knowledge transfer, onboarding, and continuous improvement of data operations.
  • 10. Integrated additional manufacturing and supply chain data sources into existing pipelines, expanding analytics coverage and delivering actionable insights to improve operations.
Python | SQL | Airflow | Power BI | Snowflake | ETL Pipeline Development (+1 more)

Nalluri Associates

Full-Stack Developer | Python | JavaScript | HTML | CSS | React | Node.js | MySQL | Flask

Jan 2019 - Mar 2020 · 1 yr 2 mos · Hyderabad, Telangana, India · Remote

  • 1. Developed and deployed a fully functional company website, handling both frontend (HTML, CSS, JavaScript) and backend (Python, MySQL, Flask) to enhance the firm’s online presence and customer engagement.
  • 2. Designed and implemented lightweight internal tools using Excel automation and simple database systems to streamline client and loan data management.
  • 3. Collaborated directly with leadership to understand business needs and deliver custom technology solutions that optimized daily financial operations.
  • 4. Improved workflow efficiency by digitizing manual loan tracking processes, reducing administrative effort and improving data accuracy.
  • 5. Gained hands-on experience in end-to-end full-stack development, database integration, and business-oriented software solutions in a real-world financial services environment.
Python | JavaScript | HTML | CSS | MySQL | Full-Stack Development

Education

University of South Florida

Master of Science in Computer Engineering

GITAM Deemed University

Bachelor of Technology in Computer Science
