Riya Chakraborty

AI Researcher

Bengaluru, Karnataka, India · 7 yrs 11 mos experience
Key Highlights

  • 7 years of experience in end-to-end ETL data pipelines.
  • Expert in AWS and data engineering technologies.
  • Proven track record in building scalable data solutions.

Skills

Core Skills

Data Engineering · AWS · Data Pipeline Development · ETL · API Development

Other Skills

AWS Athena · AWS Glue · AWS Redshift · AWS S3 · Airflow · Apache Airflow · Apache Spark · Athena · Databricks DataFlow · Delta Lake · EC2 · EMR · Extract, Transform, Load (ETL) · Flask · Glue

About

Worked end to end with ETL data pipelines for 7 years, using technologies such as SQL, PySpark, Kafka, Airflow, AWS, Delta Lake, Python, Flask, Superset, and data warehousing.

Experience

Grab

Senior Data Engineer

Apr 2024 – Present · 1 yr 11 mos · Bengaluru, Karnataka, India · Hybrid

  • Designed and developed robust data pipelines to facilitate seamless data exchange between Grab and leading digibanks across Malaysia, Singapore, and Indonesia, ensuring compliance with stringent banking regulations.
  • Spearheaded the integration of financial products such as credit lending for Grab drivers, MSMEs, passenger loan products, and marketing analysis, ensuring smooth data flow across various use cases.
  • Optimized data processing workflows using cutting-edge data engineering technologies, including Apache Spark, Presto, AWS S3, Databricks DataFlow, and Delta Lake.
  • Integrated and orchestrated scalable data workflows with Apache Airflow, enhancing efficiency in data processing and management.
  • Managed scalable and cost-effective data storage solutions on AWS S3, ensuring secure and efficient access to critical data assets.
  • Developed impactful dashboards and visualizations using Superset, empowering teams with actionable insights for data-driven decision-making.
  • Implemented Delta Lake architecture to ensure data consistency, reliability, and high-quality data management across systems.
  • Collaborated with cross-functional teams to ensure adherence to regulatory and compliance requirements in the financial sector, maintaining alignment with regional banking standards.
Apache Spark · Presto · AWS S3 · Databricks DataFlow · Delta Lake · Apache Airflow · +3

Cyble Inc.

Data Engineer III

Jun 2023 – Mar 2024 · 9 mos · Bengaluru, Karnataka, India · Hybrid

  • Projects:
  • Designed and created an on-demand data pipeline on AWS cloud infrastructure to process terabytes of ransomware-related data, using AWS S3, Lambda, EC2, and Athena, with Python and PySpark as coding languages.
  • Key responsibilities:
  • Designed and implemented an on-demand data pipeline on AWS cloud infrastructure, optimizing the processing of large volumes of ransomware-related and dark web data.
  • Utilized a range of AWS services, including S3, Lambda, Textract, EC2, and Athena, to ensure high efficiency and scalability in data processing workflows.
  • Leveraged Python and PySpark for data transformation, processing, and automation, ensuring seamless integration of multiple data sources.
  • Developed and executed strategic plans to manage team tasks, ensuring efficient and timely completion of project objectives.
  • Technologies used:
  • AWS Athena for data preparation
  • Redshift Spectrum
  • Lambda for pipeline triggering
  • EC2 instances to run the pipeline
  • S3 as primary storage for data
  • Python and PySpark as coding languages
  • Superset for dashboards
AWS S3 · Lambda · EC2 · Athena · Python · PySpark · +3

Zoomcar

Data Engineer

Dec 2022 – Jun 2023 · 6 mos · Bengaluru, Karnataka, India · Hybrid

  • Project:
  • Designed a hotspot detection system for Zoomcar publicity: built a pipeline that uses GPS and sensor data to detect hotspots around several prime locations.
  • Technologies:
  • SQL
  • PySpark
  • Airflow
  • AWS Athena, EMR cluster, S3, Redshift, Lambda
  • Delta Lake
SQL · PySpark · Airflow · AWS Athena · EMR · S3 · +5

Inviz AI

Senior Data Engineer

Nov 2021 – Dec 2022 · 1 yr 1 mo · Bengaluru, Karnataka, India

  • Built an intelligent search (insearch) platform for an e-commerce client, including better auto-suggestions and appropriate product visibility.
  • Responsibilities:
  • Created data pipelines for features such as ranking, auto-suggestions, and personalisation.
  • Created an API for the personalisation feature.
  • Technologies used:
  • AWS technologies such as S3, Glue, Athena, and Redshift
  • Python as the coding language, with PySpark and SQL for computation and data fetching
  • Airflow for orchestration
  • Flask for API building
AWS S3 · Glue · Athena · Redshift · Python · PySpark · +4

Amazon

Data Engineer

Apr 2020 – Oct 2021 · 1 yr 6 mos · Bengaluru, Karnataka, India

  • Projects:
  • 1. CPEX, Amazon's packaging system, ensures that any shipped item is packaged without wasted material or overweight issues, while adhering to all Amazon guidelines.
  • Key responsibilities:
  • Built data pipelines for proper data flow from AWS Redshift to S3 and vice versa, using an ETL platform, AWS Glue, Python, and SQL.
  • 2. Scheduler, an internal tool to deschedule any ETL job that is no longer needed and to reschedule it, if needed, after it has failed a number of times.
  • Technologies used:
  • AWS Redshift, AWS Glue, Python as the programming language, AWS Athena, and an ETL platform as the orchestrator
AWS Redshift · AWS Glue · Python · AWS Athena · Data Engineering · ETL

Noodle.ai

3 roles

Data Engineer

Promoted

Oct 2019 – Apr 2020 · 6 mos

  • Enterprise Data Platform
  • Built an AI platform from scratch to store client data cost-efficiently, sharding the data effectively and exposing it through several channels, such as an API and a Python package.
  • Key responsibilities:
  • Built data pipelines to migrate data from Kafka to different data stores, such as SQL Server, Hive, and PostgreSQL, and exposed data from these sources to client dashboards via an API.
  • Technologies used:
  • Python, Kafka, SQL Server, PySpark, Flask API, Presto
Python · Kafka · SQL Server · PySpark · Flask · Presto · +2

Associate Data Engineer

Jul 2018 – Oct 2019 · 1 yr 3 mos

  • Built Data Profiler, a tool used by the data science team to visualize data with information such as data types, null-value measurements, and data distribution, using graphs like histograms and pie charts.
  • Technologies:
  • Python, PySpark, SQL Server, Airflow
Python · PySpark · SQL Server · Airflow · Data Engineering

Data Engineering Internship

Jan 2018 – Jun 2018 · 5 mos

PES University

Summer Internship, Bangalore — GUI Development and Version Control (Cultyvate)

Jun 2017 – Jul 2017 · 1 mo · Bangaon, West Bengal, India

Education

PES University

Master of Computer Applications - MCA — Computer Applications

Jan 2015 – Jan 2018

Vidyasagar College

B.Sc Hons — Computer Science

Jan 2012 – Jan 2014

Barasat Kalikrishna Girls' High School

Jan 2003 – Jan 2011
