Supriya Ravikumar

Software Engineer

Redmond, Washington, United States · 11 yrs 5 mos experience

Key Highlights

  • Expert in building reliable data pipelines at scale.
  • Strong background in healthcare data engineering and compliance.
  • Proficient in modern data stack technologies.

Skills

Core Skills

Data Engineering · Data Pipelines

Other Skills

AWS · AWS Glue · AWS Step Functions · Airflow · Amazon Athena · Amazon Redshift · Amazon S3 · Amazon Web Services (AWS) · Analytical Modelling · Apache Airflow · Autosys · Azure Databricks · CI/CD · Cadence Schematic Capture · Change Data Capture

About

I’m a Senior Data Engineer who loves turning messy, high-stakes, large-scale data into trustworthy, analytics-ready products. My sweet spot is the modern data stack—Snowflake, dbt, Airflow, Databricks, Python/SQL, and AWS—paired with solid data modeling (star/snowflake), testing, and governance.

Over the last decade, I’ve built end-to-end data pipelines at scale, processing terabytes of healthcare and enterprise data, including EHR/claims (HL7/X12/837) and API/file ingestions. I design for reliability first: clear contracts, robust dbt tests/Great Expectations, CDC patterns, and CI/CD with Git/Jenkins so changes ship safely and repeatably.

I thrive at the intersection of engineering, analytics, and product—translating business needs into durable models and SLAs, optimizing performance/cost in Snowflake, and documenting so teams can move faster. I also enjoy mentoring, code reviews, and raising the bar on data quality.

Open to: Senior/Lead Data Engineer roles (Seattle/Remote). [H4 EAD—authorized to work in the U.S.]

Experience

11 yrs 5 mos
Total Experience
2 yrs 10 mos
Average Tenure
--
Current Experience

Clarivate

Senior Lead Software Engineer

Jun 2024 – Feb 2025 · 8 mos

  • Led end-to-end large-scale healthcare data engineering on Snowflake, dbt, and Airflow, building reliable, governed pipelines for EHR/claims data. Delivered analytics-ready datasets at scale that met HIPAA and enterprise quality standards through solid modeling, testing, and CI/CD practices.
  • Key impact:
  • Architected batch and near-real-time pipelines on Snowflake, standardizing ingestion of high-volume HL7 and X12 (837) feeds into Bronze/Silver/Gold layers.
  • Designed large-scale dimensional models and subject-area marts (patients, providers, claims, encounters), powering consistent KPI and regulatory reporting.
  • Built dbt projects (incremental models, snapshots, seeds) with robust test coverage, lineage, and documentation for enterprise-scale traceability.
  • Orchestrated workflows in Apache Airflow (scheduling, retries, SLAs), integrating Python utilities for parsing and validating high-volume SFTP/API data.
  • Implemented quality gates with Great Expectations and dbt tests; enforced schema evolution, CDC patterns, and scalable governance controls.
  • Established Git/Jenkins CI/CD for dbt and Airflow deployments, improving release cadence and reliability of large data workloads.
  • Partnered with analysts and product teams to translate requirements into scalable data contracts and SLAs; optimized Snowflake storage/compute for cost and performance at scale.
Snowflake · dbt · Airflow · EHR · HIPAA · Python +4
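The CDC pattern mentioned above can be sketched in plain Python — a minimal, hypothetical illustration of classifying source rows as inserts or updates against the current target state. In production this logic would typically live in a Snowflake MERGE statement or a dbt incremental model; the key name and row shapes here are assumptions for illustration only.

```python
# Illustrative CDC merge planning: classify a batch of source rows as
# inserts or updates against the current target state, keyed on a
# business key. Hypothetical sketch, not the production implementation.

def plan_cdc_changes(source_rows, target_rows, key="claim_id"):
    """Return (inserts, updates) for a batch of source rows."""
    target_by_key = {row[key]: row for row in target_rows}
    inserts, updates = [], []
    for row in source_rows:
        existing = target_by_key.get(row[key])
        if existing is None:
            inserts.append(row)     # new business key -> insert
        elif existing != row:
            updates.append(row)     # key exists, payload changed -> update
    return inserts, updates

target = [{"claim_id": 1, "status": "open"}]
source = [{"claim_id": 1, "status": "paid"},   # changed -> update
          {"claim_id": 2, "status": "open"}]   # new key -> insert
ins, upd = plan_cdc_changes(source, target)
```

Unchanged rows fall through both branches, so only genuine deltas propagate downstream — the same idea a warehouse-side MERGE expresses declaratively.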

Agilon health

Senior Data Engineer

Sep 2021 – Jun 2024 · 2 yrs 9 mos

  • As a Senior Data Engineer, responsible for managing the team, analyzing technical specifications, and designing and delivering tasks using Snowflake, SQL, and Python. Experienced with EMR/EHR data structures and protocols, including pulling data from and pushing data back to patient medical records and EHR systems, and well versed in HL7 standards for healthcare data.
  • Key impact:
  • Directed architecture and development of healthcare-specific batch processing pipelines in Snowflake and Python, ingesting EHR/EMR datasets and processing formats including HL7, 837, and CSV.
  • Engineered and maintained Apache Airflow DAGs for scheduled batch and event-driven ETL workflows across clinical and operational data domains.
  • Built automated SFTP export frameworks in Python to deliver Snowflake datasets to external systems.
  • Designed and implemented star and snowflake schema data models for accurate, scalable healthcare KPI reporting.
  • Applied Great Expectations for automated data validation; integrated Jenkins CI/CD pipelines to deploy dbt models and Python jobs.
  • Ensured HIPAA compliance and maintained audit trails for PHI data in Snowflake and downstream marts.
  • Collaborated with clinical analysts, operations, and data science teams to define KPIs and deliver governed, trusted datasets.
Snowflake · SQL · Python · EHR · HL7 · Apache Airflow +4
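The automated data-validation step above can be sketched as plain-Python quality gates in the spirit of dbt tests and Great Expectations (this is not the actual Great Expectations API): not-null and accepted-values checks applied to claim records before they reach a downstream mart. Field names and the accepted set are hypothetical.

```python
# Plain-Python sketch of quality gates akin to dbt's not_null and
# accepted_values tests. Column names and the accepted set are
# illustrative assumptions, not the production schema.

ACCEPTED_CLAIM_TYPES = {"837P", "837I"}  # hypothetical accepted values

def validate_claims(rows):
    """Return a list of (row_index, failure_reason) for failing rows."""
    failures = []
    for i, row in enumerate(rows):
        if row.get("member_id") is None:
            failures.append((i, "member_id is null"))
        if row.get("claim_type") not in ACCEPTED_CLAIM_TYPES:
            failures.append((i, "claim_type not in accepted values"))
    return failures

rows = [
    {"member_id": "M1", "claim_type": "837P"},   # passes both checks
    {"member_id": None, "claim_type": "837X"},   # fails both checks
]
failures = validate_claims(rows)
```

A pipeline gate would fail the load (or quarantine rows) whenever this list is non-empty, which is how such checks keep bad records out of governed marts.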

Larsen & Toubro Infotech

Specialist – Data Engineer

Jan 2021 – Sep 2021 · 8 mos

  • As a technical lead, delivered development tasks using the Talend Big Data edition, working with AWS (Glue, Redshift SQL, and S3) and Python, and coordinated with the onsite lead to understand delivery scope.
  • Key impact:
  • Migrated data pipelines from DataStage to Talend.
  • Contributed to a Data Lake project spanning multiple source systems.
  • Built ingestion workflows loading data from S3 into Redshift.
  • Assisted in building ETL source-to-target specification documents from business requirements.
  • Created and scheduled Talend jobs in Autosys and TAC.
  • Flattened nested response files and loaded them into a relational data model.
Talend · AWS · Python · Redshift · S3 · Data Lake +1
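The response-flattening step above can be illustrated with a small, hypothetical Python sketch: collapsing a nested API response into single-level column/value rows suitable for a relational load (for example, S3 to Redshift). The sample record and key-joining convention are assumptions for illustration.

```python
# Hypothetical sketch of flattening a nested response record into a
# single-level dict (column -> value) ready for a relational load.
# Nested keys are joined with an underscore, e.g. order.customer.name
# becomes order_customer_name.

def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

response = {"order": {"id": 42, "customer": {"name": "Acme"}}, "status": "ok"}
row = flatten(response)
```

Each flattened key then maps naturally onto a target table column, which is what makes the record loadable into a relational model.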

Wipro

Senior Software Engineer

Sep 2013 – Jan 2021 · 7 yrs 4 mos

  • As a developer, responsible for the development, support, maintenance, and implementation of complex components of a project module.
  • Key impact:
  • Extensive hands-on experience with the ETL process, including data transformation and the design and development of mappings and workflows.
  • Migrated data from sources such as SAP, DIAD, OH (Order Hub), and SFDC to Amazon Redshift through Informatica Cloud; data sources had to be joined, aggregated, and transformed before running various data warehousing processes.
  • Assisted in building ETL source-to-target specification documents from business requirements.
  • Designed, developed, documented, and tested ETL jobs and mappings in DataStage Server and Parallel jobs to populate Data Warehouse and Data Mart tables; Server jobs used stages such as Sequential File, ODBC, Hashed File, Aggregator, Transformer, Sort, Link Partitioner, and Link Collector.
  • Designed Parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Dataset, Lookup File Set, Complex Flat File, Modify, Aggregator, and XML.
  • Developed mappings performing extraction, transformation, and load of source data into the Derived Masters schema using Informatica PowerCenter transformations such as Source Qualifier, Aggregator, Filter, Router, Sequence Generator, Lookup, Rank, Joiner, Expression, Stored Procedure, SQL, Normalizer, and Update Strategy to meet business logic in the mappings.
  • Built reusable transformations and mapplets wherever redundancy was needed.
  • Performed performance tuning at both the mapping and database levels to increase data throughput.
  • Designed the process control table that maintained the status of all CDC jobs and thereby drove the loads of the derived master tables.
  • Staged data from the legacy system into Oracle 11g master tables.
  • Performed CDC capture registrations.
Informatica Cloud · Amazon Redshift · ETL · Data Warehousing · Data Engineering
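The process-control pattern described above can be sketched minimally in Python: a control table tracks per-job CDC status, and a downstream derived-master load runs only when every upstream CDC job has succeeded. The job names and status values here are hypothetical.

```python
# Hypothetical sketch of a process-control table driving CDC loads:
# each CDC job records its status, and a derived-master load is gated
# on all of its upstream jobs having succeeded.

control_table = {
    "cdc_orders":    "SUCCESS",
    "cdc_customers": "SUCCESS",
    "cdc_invoices":  "RUNNING",
}

def ready_to_load(control, upstream_jobs):
    """A derived-master load may start only when every upstream CDC job succeeded."""
    return all(control.get(job) == "SUCCESS" for job in upstream_jobs)

can_load_customers = ready_to_load(control_table, ["cdc_customers"])
can_load_billing = ready_to_load(control_table, ["cdc_orders", "cdc_invoices"])
```

In the real system this table lives in the database and is updated by the CDC jobs themselves, giving restartability and an audit trail for each load cycle.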

Education

Birla Institute of Technology And Science (BITS), Pilani

Master's degree — Computer Science

Jan 2017 – Present

Indian Academy Group of Institutions

Bachelor's degree — Computer Science

Jan 2013 – Present
