Supriya P — Data Engineer

Data Engineer with over 9 years of experience in designing, implementing, and maintaining data infrastructure solutions, including developing and optimizing data pipelines on the AWS cloud platform. Proficient in AWS data services and passionate about leveraging cloud technologies to drive business insights and decision-making. Adept at collaborating with cross-functional teams to deliver scalable and efficient data pipelines.

Implemented ETL processes on AWS using AWS Glue to extract, transform, and load data from various sources into Amazon S3 and data warehouses like Amazon Redshift.
Developed and maintained data pipelines for real-time data streaming and processing using Amazon Kinesis and AWS Lambda.
Implemented and optimized data processing tasks on Amazon EMR using Apache Spark for distributed computing.
Utilized Python and Scala for scripting and automation tasks, optimizing data processing.
Developed Spark applications to perform large-scale data processing and analysis, leveraging Spark SQL for complex transformations and aggregations.
Designed and optimized data models to ensure efficient storage, retrieval, and querying of data in Snowflake and Amazon Redshift.
Implemented data quality checks and monitoring processes to ensure the integrity and reliability of the data pipeline.
Designed and implemented scalable data pipelines using PySpark to process and transform large datasets.
Integrated data from various sources, such as databases, APIs, and file systems, into the data processing pipeline.
Scheduled and managed PySpark jobs using tools like Apache Airflow, AWS Glue, or other orchestration frameworks.
Leveraged Amazon S3 for data storage and Amazon Redshift for scalable data warehousing solutions, optimizing performance and cost-effectiveness.
Implemented end-to-end data pipelines using Informatica Intelligent Cloud Services (IICS) for seamless integration of disparate data sources into Snowflake.
Designed and developed ETL workflows using Informatica PowerCenter to extract, transform, and load data from various on-premises and cloud-based sources.
Designed, implemented, and maintained relational databases such as PostgreSQL, MySQL, and SQL Server to store structured data for enterprise applications.
Developed PL/SQL scripts for data manipulation, transformation, and validation in Oracle databases.

Stackforce AI infers this person is a Fintech Data Engineer specializing in fraud detection and data integration.

Location: Irving, Texas, United States

Experience: 9 yrs 9 mos

Skills

Extract, Transform, Load (ETL)
Snowflake
Data Integration
ETL Processes

Career Highlights

Over 9 years of experience in data engineering.
Expert in designing scalable data pipelines on AWS.
Proven track record in fraud detection and data integration.

Work Experience

Mastercard

Data Engineer (2 yrs 4 mos)

BMW Group

Senior Associate (1 yr 5 mos)

Bank of America

Senior Software Engineer (6 yrs)

Education

Bachelor of Technology - BTech at Jawaharlal Nehru Technological University Hyderabad (JNTUH)

Highly Stable

Skills

Core Skills

Extract, Transform, Load (ETL)
Snowflake
Data Integration
ETL Processes

Other Skills

AWS Glue
AWS Step Functions
Amazon CloudWatch
Amazon Redshift
Amazon S3
Amazon Web Services (AWS)
AngularJS
Apache Airflow
Apache Spark
Autosys
Azure Cosmos DB
Azure Data Factory
Azure Data Lake
Azure Databricks
Azure DevOps Services

Experience

Total Experience: 9 yrs 9 mos
Average Tenure: 3 yrs 8 mos
Current Experience: 2 yrs 4 mos

Mastercard

Data Engineer

Feb 2024 – Present · 2 yrs 4 mos · Austin, Texas, United States · Remote

Designed and implemented fraud-focused ETL pipelines using IBM DataStage, extracting suspicious transactions and loading them into Snowflake.
Developed complex SQL queries and Snowflake transformations to enrich fraud datasets, improving fraud detection model accuracy by 25%.
Built automated data validation and reconciliation scripts in Python to ensure accuracy and completeness of fraud monitoring data.
Partnered with Fraud Analytics & Risk teams to translate detection rules into technical workflows (velocity checks, anomaly detection).
Contributed to legacy system migration by moving fraud and disputes data from Oracle to Snowflake, ensuring schema alignment and performance tuning.
Implemented governance and access controls in Snowflake (RBAC, masking policies, encryption) to meet compliance standards.
Designed real-time fraud monitoring pipelines using Kafka + Snowpipe, enabling sub-minute latency for suspicious transaction alerts.
Documented workflows, ETL designs, and data lineage for compliance audits and operational efficiency.

Git · SQL · RDBMS · Extract, Transform, Load (ETL) · PySpark · Python (Programming Language) · +5
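
The automated validation and reconciliation scripts mentioned for this role are not shown in the profile. As a rough illustration of the idea only, the Python sketch below compares source records against loaded records by key. The function name `reconcile`, the field names `txn_id` and `amount`, and the sample rows are all hypothetical.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, key="txn_id", amount="amount"):
    """Compare two lists of dict records and report discrepancies by key."""
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}

    # Keys present in the source extract but never loaded into the target.
    missing = sorted(source.keys() - target.keys())
    # Keys present in the target that the source extract does not contain.
    unexpected = sorted(target.keys() - source.keys())
    # Keys present on both sides whose amounts disagree (compared as decimals).
    mismatched = sorted(
        k
        for k in source.keys() & target.keys()
        if Decimal(str(source[k][amount])) != Decimal(str(target[k][amount]))
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Hypothetical sample data, for illustration only.
source_rows = [
    {"txn_id": "T1", "amount": "10.00"},
    {"txn_id": "T2", "amount": "25.50"},
    {"txn_id": "T3", "amount": "7.25"},
]
target_rows = [
    {"txn_id": "T1", "amount": "10.00"},
    {"txn_id": "T2", "amount": "25.05"},
    {"txn_id": "T4", "amount": "3.00"},
]

report = reconcile(source_rows, target_rows)
print(report)
```

Comparing amounts as `Decimal` rather than `float` avoids false mismatches caused by binary rounding, which matters when the same monetary value is serialized differently by two systems.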

BMW Group

Senior Associate

Feb 2022 – Jul 2023 · 1 yr 5 mos · Hyderabad, Telangana, India · Remote

Developed Snowflake data models and ELT pipelines to integrate fraud and dispute datasets from AWS S3, RDBMS, and APIs.
Built data lakes with schema evolution support using AWS Glue and Amazon S3 to handle structured and unstructured fraud data, including transactions and claims.
Applied ANSI SQL and Python-based fraud data checks to detect anomalies such as duplicate claims, transaction spikes, and abnormal patterns.
Integrated dispute case management data into Snowflake for downstream fraud risk analysis and regulatory reporting.
Automated fraud data ingestion workflows with Airflow and Step Functions, reducing manual intervention by 40%.
Implemented row-level security and masking for PII, ensuring GDPR and SOX compliance.
Optimized Snowflake queries and clustering strategies, reducing fraud query response times from hours to minutes.
Collaborated with risk teams to design fraud dashboards in Tableau and Power BI, providing real-time visibility into fraudulent activity.

Snowflake · Python (Programming Language) · PySpark · SQL · Amazon Web Services (AWS) · Azure Data Factory · +1
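
The Python-based checks for duplicate claims and transaction spikes listed above are described only at a high level. The sketch below shows one plausible shape for such checks, not the actual implementation: duplicates are found by counting a tuple of identifying fields, and spikes by a simple z-score against the mean. The field names, the `z_threshold` default, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def find_duplicate_claims(claims, fields=("policy_id", "claim_date", "amount")):
    """Return the field tuples that appear in more than one claim record."""
    counts = Counter(tuple(claim[f] for f in fields) for claim in claims)
    return [key for key, n in counts.items() if n > 1]


def find_spikes(daily_counts, z_threshold=2.0):
    """Return indexes of days whose count sits more than z_threshold
    population standard deviations above the mean."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return []  # every day is identical, so nothing stands out
    return [i for i, c in enumerate(daily_counts) if (c - mu) / sigma > z_threshold]


# Hypothetical sample data, for illustration only.
claims = [
    {"claim_id": 1, "policy_id": "P1", "claim_date": "2023-01-05", "amount": 500},
    {"claim_id": 2, "policy_id": "P2", "claim_date": "2023-01-06", "amount": 120},
    {"claim_id": 3, "policy_id": "P1", "claim_date": "2023-01-05", "amount": 500},
]
daily_counts = [100, 98, 103, 101, 99, 102, 400]

duplicates = find_duplicate_claims(claims)
spikes = find_spikes(daily_counts)
print(duplicates, spikes)
```

A global z-score is the simplest possible baseline. On a short series a single outlier inflates the standard deviation, so a production check would more likely compare each day against a trailing window or a robust statistic such as the median.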

Bank of America

Senior Software Engineer

Jan 2016 – Jan 2022 · 6 yrs · Hyderabad, Telangana, India · On-site

Built and maintained DataStage ETL pipelines to integrate fraud, risk, and financial crimes datasets into Snowflake and Oracle.
Designed fraud data marts in Snowflake using Star/Snowflake schemas, enabling fast reporting and anomaly detection queries.
Developed fraud detection rules in SQL and Python (velocity checks, transaction thresholds, geo-location mismatches).
Partnered with Fraud Technology teams to enable machine learning workflows for fraud scoring and anomaly detection.
Contributed to data migration initiatives, moving fraud monitoring systems from on-prem Oracle to Snowflake.
Implemented real-time fraud streaming pipelines using AWS Kinesis and Kafka, feeding alerts into Snowflake for near real-time risk assessment.
Applied RBAC, encryption, and auditing in Snowflake and DataStage to maintain compliance with financial regulations.
Documented data flows, ETL processes, and fraud detection logic for compliance audits and internal governance.
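
The rule types named for this role (velocity checks, transaction thresholds, geo-location mismatches) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the rules actually deployed: the function names, the limit of 3 transactions per 10 minutes, the 5000 amount limit, and the record fields are all hypothetical.

```python
from datetime import datetime, timedelta


def velocity_check(timestamps, max_txns=3, window=timedelta(minutes=10)):
    """Return True if more than max_txns timestamps fall inside any window."""
    ordered = sorted(timestamps)
    start = 0
    for end, ts in enumerate(ordered):
        # Slide the left edge forward until the window fits again.
        while ts - ordered[start] > window:
            start += 1
        if end - start + 1 > max_txns:
            return True
    return False


def triggered_rules(txn, recent_timestamps, amount_limit=5000):
    """Return the names of the rules that a transaction trips."""
    rules = []
    if velocity_check(recent_timestamps + [txn["timestamp"]]):
        rules.append("velocity")
    if txn["amount"] > amount_limit:
        rules.append("amount_threshold")
    if txn["txn_country"] != txn["card_country"]:
        rules.append("geo_mismatch")
    return rules


# Hypothetical sample data, for illustration only.
base = datetime(2021, 6, 1, 12, 0)
recent = [base + timedelta(minutes=m) for m in (0, 2, 4)]
txn = {
    "timestamp": base + timedelta(minutes=5),
    "amount": 7500,
    "card_country": "IN",
    "txn_country": "BR",
}

flags = triggered_rules(txn, recent)
print(flags)
```

Returning the list of tripped rule names, rather than a single boolean, keeps each alert explainable, which fits the audit and documentation duties listed for the role.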