Varun Kumar N

Data Engineer

St. Louis, Missouri, United States · 6 yrs experience

Key Highlights

  • 3+ years of experience in data analytics and automation.
  • Expert in building ETL pipelines and real-time dashboards.
  • Proficient in cloud-native solutions with Azure and AWS.

Skills

Core Skills

Data Engineering · Cloud Architecture · Data Visualization · ETL Pipelines

Other Skills

AWS · Adaptability · Algorithms · Amazon MSK · Analytical Skills · Artificial Intelligence for Business · Artificial Intelligence for Design · Azure Data Factory · Azure Databricks · Backup & Recovery Systems · BigQuery · Business Analysis · Business Requirements · C · CI/CD

About

I’m a Cloud Data Analyst passionate about transforming raw data into impactful business insights through scalable, cloud-native solutions. I bring 3+ years of experience in data analytics and automation, from building ETL pipelines with Python and AWS to delivering real-time dashboards using Power BI and Fabric Lakehouse. I specialize in end-to-end data pipeline design across Azure Data Factory, Snowflake, and Microsoft Fabric to drive smart, automated decision-making.

In my most recent role at CDW, I deployed Fabric-powered prototypes integrated with Azure SQL and Power BI Copilot to modernize audit reporting. I also built ETL workflows in ADF that cut data refresh delays by 40%, improving accuracy and accessibility.

I’m currently seeking a full-time opportunity at a product-based company where I can solve complex data challenges and build cloud-based, future-ready solutions with Microsoft Fabric, Azure, Python, and SQL.

Technical Strengths: Microsoft Fabric | Azure Data Factory | Power BI | Python | SQL | Snowflake | AWS Lambda | Databricks | Alteryx | Git | ETL Pipelines | Cloud Architecture

Let’s connect if you’re looking for someone who can bring both innovation and execution to your data team.

Experience

6 yrs
Total Experience
1 yr 7 mos
Average Tenure
1 yr 10 mos
Current Experience

CDW

AWS Data Engineer

Jun 2024 – Present · 1 yr 10 mos · Hybrid

  • Designed and optimized ETL pipelines using Python, PySpark, SQL, and Azure Data Factory, automating ingestion of SaaS, logistics, and financial data.
  • Designed and deployed scalable data pipelines using AWS services such as EMR, S3, Athena, and Lambda.
  • Implemented real-time streaming solutions with Amazon MSK (Kafka) and integrated data into downstream analytics platforms.
  • Built and maintained data models and reporting datasets supporting dashboards used by 500+ users.
  • Automated cloud compute, network, and storage workflows using Docker and Kubernetes, reducing deployment time by 40%.
  • Managed Linux/Unix applications and cron jobs for nightly batch and streaming data processing.
  • Integrated REST APIs with Git-based CI/CD pipelines (GitLab, Jenkins, Azure DevOps) for consistent data refresh.
  • Optimized SQL queries, PL/SQL scripts, and indexing strategies, achieving 25% faster analytics performance.
  • Developed data validation and reconciliation frameworks to ensure accuracy and completeness of ETL pipelines.
  • Implemented logging, monitoring, and error-handling frameworks for pipelines, improving uptime by 20%.
  • Partnered with cross-functional teams to deliver predictive analytics and machine learning-ready data solutions.
  • Mentored junior engineers on Python, PySpark, and data pipeline best practices.
Python · PySpark · SQL · Azure Data Factory · AWS · Docker +6

Excelerate

2 roles

Data Visualization Associate

Jan 2024 – Feb 2024 · 1 mo · Remote

  • Developed the project plan for a global event, strengthening skills in team management, documentation, and crisis management. Created visual planning tools such as RACI matrices and team and project charters, presented data concepts during meetings to foster collaboration across a diverse international team, and maintained clear communication through the internship platform while completing all tasks assigned by the Project Head.
Data Visualization · Microsoft Excel · Data Validation · Problem Solving · Data Analysis · Python +2

Project Manager

May 2023 – Jun 2023 · 1 mo · Remote

  • Planned and coordinated project deliverables, applying leadership, communication, project planning, risk management, and stakeholder management skills while balancing time, budget, and resources using standard project management tools and methodologies.
Python · SQL · Shell Scripting · GCP · BigQuery · Data Engineering

Sodexo

Student Supervisor

Jan 2024 – Dec 2024 · 11 mos · St. Louis, Missouri, United States · On-site

Student Welfare

NTT Data Services

Azure Data Engineer

Aug 2021 – Aug 2023 · 2 yrs · Bengaluru, Karnataka, India · Remote

  • Developed and deployed data pipelines on Azure Data Factory and Azure Databricks to process large-scale datasets.
  • Built and maintained ETL pipelines using Python, SQL, and PySpark for enterprise analytics solutions.
  • Containerized 15+ applications using Docker & Kubernetes, automating hybrid cloud ETL deployments.
  • Migrated 10TB+ datasets from legacy Oracle systems into cloud-based PostgreSQL and MongoDB environments.
  • Designed data pipelines for batch and streaming data, supporting analytics and BI reporting.
  • Developed orchestration workflows using cron jobs and Python automation scripts.
  • Configured and optimized Oracle PL/SQL queries and stored procedures for high-volume data processing.
  • Built data lineage, dependency tracking, and metadata management frameworks to improve governance.
  • Mentored junior developers on data modeling, Python, SQL, and ETL design patterns.
  • Implemented monitoring, logging, and alerting with Azure Monitor and Application Insights, improving system reliability.
Azure Data Factory · Azure Databricks · Python · SQL · Docker · Kubernetes +3

TP

GCP Data Engineer

Oct 2019 – Jul 2021 · 1 yr 9 mos · Hybrid

  • Built and maintained reporting and ETL pipelines using Python, SQL, and shell scripting, reducing processing time by 30%.
  • Designed and deployed cloud-based ETL pipelines on GCP using BigQuery and Cloud Storage.
  • Automated data extraction, transformation, and loading for operational dashboards.
  • Prepared BI reports from large relational datasets to support data-driven decisions.
  • Developed Python scripts for batch processing and data cleansing, improving analytics quality.
  • Scheduled recurring ETL jobs with cron on Linux/Unix servers.
  • Optimized SQL Server and Oracle queries for performance.
  • Documented ETL workflows, database schemas, and automated reporting processes.
  • Implemented GCP monitoring and logging solutions with Stackdriver to ensure data pipeline reliability.
Python · SQL · ETL Pipelines · Data Processing

Education

Saint Louis University

Master of Science - MS

Aug 2023 – Dec 2024

Sri Venkateswara College of Engineering and Technology, Chittoor

Bachelor of Technology - BTech — Computer Science

Aug 2017 – Aug 2021
