Gayathri S.

Data Engineer

Albany, New York, United States · 3 yrs 6 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Built scalable data pipelines improving data availability by 35%.
  • Automated data ingestion achieving 99.9% pipeline reliability.
  • Developed dimensional models enhancing query performance by 50%.

Skills

Core Skills

Data Engineering · Cloud Computing · Data Analysis

Other Skills

Apache Spark · Kafka · AWS Glue · Lambda · Step Functions · Delta Lake · SQL · PySpark · GitHub Actions · Terraform · AWS S3 · Python · Snowflake · dbt · Hive

About

I’m a Data Engineer with 3+ years of experience building reliable, scalable data pipelines across insurance, public sector, and ad-tech domains. I specialize in turning raw, messy data into analytics-ready datasets that business and analytics teams can trust.

Currently, I work on batch and streaming pipelines using Spark, Kafka, and AWS (S3, Glue, Lambda, Step Functions), supporting actuarial, risk, and operations teams with high-quality data at scale. I’ve built and maintained data lakes with Delta Lake, developed dimensional models in Snowflake, and improved query performance and pipeline reliability through optimization and automation.

Previously, I worked as a Data Analyst in public sector and ad-tech environments, where I designed ELT workflows with dbt, optimized large SQL workloads, and implemented data validation frameworks to support reporting, compliance, and decision-making. I’ve handled multi-terabyte datasets, integrated data from APIs and SFTP sources, and reduced manual reporting through automation with Airflow and Python.

What I bring:

  • Strong hands-on experience with Python, SQL, Spark, Kafka, and AWS
  • Solid understanding of data modeling, ETL/ELT, and analytics engineering
  • Focus on data quality, reliability, and performance
  • Ability to translate business requirements into scalable data solutions

I enjoy working at the intersection of engineering and analytics, building systems that make data accessible, accurate, and useful. I’m always open to opportunities where I can grow as a data engineer and work on impactful data products.

📫 Feel free to connect if you’d like to talk about data engineering, analytics, or cloud data platforms. 📞 +1 (802)531-0104

Experience

Metlife

Data Analyst/ Analytics Engineer

Aug 2025 - Present · 7 mos

  • Designed and maintained batch and streaming data pipelines using Apache Spark, Kafka, AWS Glue, Lambda, and Step Functions, improving data availability for actuarial and risk teams by 35%.
  • Built and managed an S3-based data lake with Delta Lake tables to centralize structured and semi-structured insurance data, cutting reconciliation efforts by 40%.
  • Automated ingestion from 15+ internal source systems, achieving 99.9% pipeline reliability for BI and analytics consumers.
  • Optimized Spark SQL and PySpark jobs through partitioning and execution plan tuning, reducing runtimes by 30–45% and accelerating dashboard refresh cycles.
  • Developed dimensional data models in Snowflake for policy and claims analytics, improving ad-hoc query performance by 50% for finance and operations teams.
  • Implemented data quality checks using Great Expectations, reducing downstream data issues by 25% and supporting compliance initiatives.
  • Supported CI/CD for data pipelines using GitHub Actions and Terraform, ensuring consistent deployments across environments.
Apache Spark · Kafka · AWS Glue · Lambda · Step Functions · Delta Lake +6
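The data quality checks mentioned above can be illustrated with a minimal, hand-rolled sketch. This is not the actual Great Expectations suite used at MetLife; the column names (`policy_id`, `claim_id`, `claim_amount`) and rules are hypothetical examples of the kind of expectations such checks enforce:

```python
# Minimal data-quality check sketch for claim records before they reach
# downstream consumers. Column names and rules are hypothetical.
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool

def run_checks(rows: list[dict]) -> list[CheckResult]:
    results = []
    # Expect no null policy IDs (analogous to expect_column_values_to_not_be_null).
    results.append(CheckResult(
        "policy_id_not_null",
        all(r.get("policy_id") is not None for r in rows),
    ))
    # Expect claim amounts to be non-negative.
    results.append(CheckResult(
        "claim_amount_non_negative",
        all(r.get("claim_amount", 0) >= 0 for r in rows),
    ))
    # Expect claim IDs to be unique across the batch.
    claim_ids = [r.get("claim_id") for r in rows]
    results.append(CheckResult(
        "claim_id_unique",
        len(claim_ids) == len(set(claim_ids)),
    ))
    return results

rows = [
    {"claim_id": 1, "policy_id": "P-100", "claim_amount": 250.0},
    {"claim_id": 2, "policy_id": None, "claim_amount": -10.0},
]
failed = [c.name for c in run_checks(rows) if not c.passed]
print(failed)  # the null policy_id and negative amount checks fail
```

Failing checks like these would typically block a pipeline run or raise an alert before bad records reach BI consumers.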

New York State Division of Criminal Justice Services

Data Analyst

Sep 2024 - May 2025 · 8 mos

  • Built end-to-end analytical pipelines using SQL, Python, Snowflake, and dbt to deliver curated datasets for justice and public safety reporting.
  • Developed modular ELT workflows with dbt, reducing transformation duplication by 35% across arrest, court, and corrections data.
  • Optimized complex SQL queries on multi-terabyte datasets, improving dashboard refresh times by 40% for executive and operational users.
  • Designed incremental and historical data models to support monthly, quarterly, and annual statutory reporting with consistent metrics for audits.
  • Integrated data from APIs and secure SFTP feeds, eliminating 20+ hours/month of manual data preparation.
  • Implemented Python- and SQL-based validation checks, reducing reporting discrepancies by 30% and improving audit readiness.
  • Documented datasets and business definitions to enable self-service analytics and reduce ad-hoc support requests.
SQL · Python · Snowflake · dbt · Data Analysis · Data Engineering
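The incremental data models described above can be sketched in miniature. This is an illustrative Python merge, not the actual dbt models; the key and timestamp field names are hypothetical, and the core idea is the same one dbt incremental materializations implement: apply a batch of new or updated records on top of an existing snapshot, keeping the latest version of each key:

```python
# Sketch of an incremental-model merge: newer rows overwrite older ones by
# key, and replayed batches are a no-op (idempotent). Field names are
# hypothetical illustrations.
def incremental_merge(existing: list[dict], batch: list[dict],
                      key: str = "record_id",
                      updated_at: str = "updated_at") -> list[dict]:
    merged = {r[key]: r for r in existing}
    for row in batch:
        current = merged.get(row[key])
        # Only overwrite when the incoming row is newer.
        if current is None or row[updated_at] > current[updated_at]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])

existing = [
    {"record_id": 1, "status": "open", "updated_at": "2025-01-01"},
    {"record_id": 2, "status": "open", "updated_at": "2025-01-01"},
]
batch = [
    {"record_id": 2, "status": "closed", "updated_at": "2025-02-01"},
    {"record_id": 3, "status": "open", "updated_at": "2025-02-01"},
]
result = incremental_merge(existing, batch)
```

Because only rows newer than the snapshot are applied, re-running the same batch after a failure produces the same result, which is what makes monthly and quarterly statutory reports reproducible for audits.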

Tata Consultancy Services

Data Analyst

May 2021 - Aug 2023 · 2 yrs 3 mos

  • Analyzed large-scale mobile advertising data using SQL, Hive, and Python, generating insights from 100M+ daily ad impressions.
  • Built automated ETL pipelines with Hadoop and Apache Spark, reducing reporting latency by 45% for sales and marketing teams.
  • Designed fact and dimension tables to standardize key ad-tech metrics (CTR, CPA, ROAS) across analytics teams.
  • Tuned Spark jobs by optimizing executor settings and partition strategies, lowering compute costs by 20%.
  • Developed Python-based anomaly detection scripts to identify traffic and revenue spikes, enabling faster investigations by fraud and ad quality teams.
  • Partnered with product and business stakeholders to convert requirements into SQL-driven dashboards for campaign optimization.
  • Automated recurring reports using Airflow DAGs, saving 15 hours per week of manual reporting effort.
SQL · Hive · Python · Hadoop · Apache Spark · Data Analysis +1
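The anomaly detection scripts mentioned above can be sketched with a simple z-score spike detector over daily impression counts. This is an assumed illustration of the general technique, not the production logic; the threshold and sample data are invented:

```python
# Sketch of a z-score spike detector over a daily metric series.
# Threshold and data are illustrative, not production values.
from statistics import mean, stdev

def detect_spikes(series: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of values deviating more than `threshold` standard
    deviations from the series mean."""
    if len(series) < 2:
        return []
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(series) if abs(v - mu) / sigma > threshold]

daily_impressions = [100, 102, 98, 101, 99, 500, 100, 97]
print(detect_spikes(daily_impressions, threshold=2.0))  # [5]
```

Flagged indices would then be routed to fraud and ad quality teams for investigation, turning a manual eyeball check into an automated alert.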

Education

University at Albany

Master of Science in Data Science

Aug 2023 - May 2025
