Esha Aishwarya — Operations Associate

As a Data Engineer at Genpact, I specialize in building scalable, optimized, and automated data solutions that enhance efficiency and insights. Starting as a One Data intern and moving into a full-time role, I’ve developed expertise in cloud data engineering, pipeline design, and AI-driven automation.

Currently, I work on a multi-client invoice processing product using Azure Databricks, SQL, SFTP, and Databricks Workflows. I design and optimize pipelines to automate invoice data ingestion, transformation, and validation from multiple sources, ensuring accuracy, traceability, and consistency. The solution follows a Medallion architecture (Bronze–Silver–Gold), with curated SQL views above the Gold layer powering Power BI dashboards for client and financial reporting. I also create Python automation scripts for workflow orchestration, error handling, and monitoring, improving reliability and reducing manual effort.

Earlier, I led a data migration project on Databricks, converting complex PostgreSQL pipelines into Spark SQL, achieving up to 60% performance gains through distributed processing and query optimization. I handled code translation, validation, and benchmarking to ensure seamless migration and reliable delivery. The data model followed the Bronze–Silver–Gold pattern, with Gold-layer views integrated into Tableau for real-time business reporting and insights.

During my internship, I contributed to a Hugging Face project, implementing data lineage tracing and model metadata extraction to improve ML model transparency and governance. This enhanced model usability and collaboration across teams. I’ve also done web scraping using BeautifulSoup and Selenium, extracting insights from Amazon, Flipkart, Nykaa, and YouTube for analytics. Additionally, I built AI-based tools using OpenAI’s GPT-3.5 to automate PDF invoice and Excel reconciliation, improving accuracy and efficiency.

I’m skilled in AWS (S3, Glue, Athena) for ETL workflows and proficient in Excel for analysis and visualization. My strengths lie in data integration, performance tuning, and automation. I’m passionate about Generative AI, NLP, and Machine Learning, continually exploring them through hands-on projects and certifications.

Tech Stack: Spark SQL | Azure Databricks | PostgreSQL | PySpark | SQL | Python | Power BI | Tableau | AWS (S3, Glue, Athena) | SFTP | Databricks Workflows | Pandas | NumPy | Scikit-learn | BeautifulSoup | Selenium | Hugging Face | Git | Excel | OpenAI API

Stackforce AI infers this person is a Data Engineer specializing in SaaS and AI/ML solutions.

Location: Bengaluru, Karnataka, India

Experience: 1 yr 9 mos

Skills

Data Engineering
Apache Spark
Performance Optimization
Web Scraping
Data Governance
Machine Learning
Data Analysis
SQL

Career Highlights

Achieved 60% performance gains in data migration projects.
Expert in building automated data solutions using Azure Databricks.
Passionate about Generative AI and Machine Learning.

Work Experience

Wells Fargo

Data Management Associate (2 mos)

Genpact

Data Engineer (1 yr 7 mos)

One Data Intern (5 mos)

EATCLUB Brands (Formerly BOX8)

Data analyst intern (2 mos)

Internshala

Internshala Student Partner (1 mo)

Education

Bachelor of Technology - BTech at Vellore Institute of Technology


AI Enabled | AI ML Practitioner



Skills


Other Skills

Databricks
PySpark
PostgreSQL
Spark SQL
Data Migration
Selenium
Google Cloud Platform (GCP)
Data Lineage
Model Metadata Extraction
BeautifulSoup
Data Extraction
Software Development Life Cycle (SDLC)
Financial Risk Management
Greenplum
Raspberry Pi


Experience

Total Experience: 1 yr 9 mos
Average Tenure: 1 yr 6 mos
Current Experience: 2 mos

Wells Fargo

Data Management Associate

Mar 2026 – Present · 2 mos · Hyderabad, Telangana, India · Hybrid

Genpact

2 roles

Data Engineer

Aug 2024 – Mar 2026 · 1 yr 7 mos · On-site

Spearheaded the migration of legacy PostgreSQL data pipelines to Spark SQL, achieving a 40-60% performance improvement through optimized query execution and distributed data processing.
Used Apache Spark to build scalable data workflows for processing and analyzing large datasets.
Improved pipeline performance with Spark SQL techniques such as predicate pushdown, broadcast joins, and in-memory caching, reducing processing time and cost.
Collaborated with cross-functional teams to keep the migration smooth, preserving data integrity with minimal downtime.
Helped optimize data storage and retrieval strategies, making queries more efficient and speeding up data delivery to downstream analytics.

Databricks | PySpark | Data Engineering | Apache Spark
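
The ideas behind two of the optimizations named above, predicate pushdown and broadcast joins, can be illustrated in plain Python. This is a conceptual sketch only, not Spark code and not the pipeline described in this role: the function name `filter_then_hash_join`, the record layout, and the sample data are all hypothetical. It filters the large input before joining and resolves the join through an in-memory lookup of the small table, which is roughly what Spark does on each executor when the small side of a join is broadcast.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record layout: (invoice_id, client_code, amount).
Invoice = Tuple[int, str, float]


def filter_then_hash_join(
    invoices: Iterable[Invoice],
    clients: Dict[str, str],
    min_amount: float,
) -> Iterator[Tuple[int, str, float]]:
    """Filter the large side first, then join against an in-memory lookup."""
    for invoice_id, client_code, amount in invoices:
        # Early filter: discard rows before the join (the idea behind predicate pushdown).
        if amount < min_amount:
            continue
        # Dictionary lookup on the small table (the idea behind a broadcast hash join).
        client_name = clients.get(client_code)
        if client_name is not None:  # inner-join semantics: unmatched rows are dropped
            yield invoice_id, client_name, amount


# Hypothetical sample data, invented for illustration.
invoices = [(1, "A", 120.0), (2, "B", 15.0), (3, "A", 300.0), (4, "Z", 500.0)]
clients = {"A": "Acme", "B": "Birch"}

joined = list(filter_then_hash_join(invoices, clients, min_amount=100.0))
print(joined)
```

Filtering first means fewer rows reach the join, and keeping the small table in memory avoids shuffling the large one, which is why these two techniques are often applied together.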

One Data Intern

Feb 2024 – Jul 2024 · 5 mos · On-site

Gained hands-on experience across data engineering, data visualization, advanced analytics, and Gen AI.
Scraped data with BeautifulSoup and Selenium from leading e-commerce and video platforms, including Flipkart, Amazon, Myntra, Nykaa, and YouTube. Using both tools let me handle static as well as dynamic web pages.
I also completed a project spanning sites such as Hugging Face and Data.world. Beyond data collection, the work involved carefully tracing data lineage so the collected data could be better understood and used.
I used advanced Excel techniques for data visualization, analysis, and cleansing to derive actionable insights. On AWS, I used crawlers to extract metadata, ran AWS Glue jobs on data in S3, and queried the results with Athena.
I completed a dedicated course on data mining to strengthen my ability to find patterns and trends in complex datasets. I also studied Generative AI, NLP, machine learning, and deep learning, and applied them in practical projects, including one that used GPT-3.5 through OpenAI's API to compare and reconcile data between PDF invoices and Excel sheets.

Selenium | Google Cloud Platform (GCP) | Web Scraping | Data Engineering
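
The invoice reconciliation project mentioned above used GPT-3.5; the sketch below does not reproduce that approach and makes no OpenAI API calls. It only illustrates, under stated assumptions, the deterministic comparison step that any such tool needs once fields have been extracted: both sources are assumed to be already reduced to a mapping from invoice number to amount. The function name `reconcile`, the tolerance value, and the sample invoices are hypothetical.

```python
from typing import Dict, List


def reconcile(
    pdf_rows: Dict[str, float],
    excel_rows: Dict[str, float],
    tolerance: float = 0.01,
) -> Dict[str, List[str]]:
    """Classify each invoice number by how the two sources compare."""
    report: Dict[str, List[str]] = {
        "matched": [],
        "amount_mismatch": [],
        "missing_in_excel": [],
        "missing_in_pdf": [],
    }
    # Walk the union of invoice numbers so nothing from either source is skipped.
    for invoice_no in sorted(pdf_rows.keys() | excel_rows.keys()):
        if invoice_no not in excel_rows:
            report["missing_in_excel"].append(invoice_no)
        elif invoice_no not in pdf_rows:
            report["missing_in_pdf"].append(invoice_no)
        elif abs(pdf_rows[invoice_no] - excel_rows[invoice_no]) <= tolerance:
            report["matched"].append(invoice_no)
        else:
            report["amount_mismatch"].append(invoice_no)
    return report


# Hypothetical extracted values, invented for illustration.
pdf_invoices = {"INV-001": 100.0, "INV-002": 250.5, "INV-003": 75.0}
excel_invoices = {"INV-001": 100.0, "INV-002": 205.5, "INV-004": 40.0}

report = reconcile(pdf_invoices, excel_invoices)
print(report)
```

A small tolerance on amounts is one way to avoid flagging rounding differences between sources as mismatches.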

EATCLUB Brands (Formerly BOX8)

Data analyst intern

May 2023 – Jul 2023 · 2 mos

To make effective use of the Redash platform, I applied advanced SQL concepts such as arrays, CASE expressions, and join types to company data, choosing joins based on table relationships and specific data needs.
Crafted concise, clear messages using the Vilpower website for late orders, duplicate orders, and out-of-stock orders.
Used web scraping to extract latitude and longitude coordinates for major malls and cities, supporting the selection of optimal locations for EatClub outlets.
Software Development Life Cycle (SDLC) | Financial Risk Management | Data Analysis | SQL
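
As a small illustration of two of the SQL concepts named above, a CASE expression and a LEFT JOIN, here is a sketch using Python's built-in `sqlite3` module. The tables, columns, and delay thresholds are hypothetical and not taken from any company data; the original work was done in Redash against a different database, and arrays are not shown because SQLite has no array type.

```python
import sqlite3

# In-memory database with hypothetical tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, outlet_id INTEGER, delay_minutes INTEGER);
    CREATE TABLE outlets (outlet_id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO orders VALUES (1, 10, 0), (2, 10, 25), (3, 99, 50);
    INSERT INTO outlets VALUES (10, 'Bengaluru');
    """
)

# LEFT JOIN keeps every order even when its outlet is missing;
# CASE turns the numeric delay into a readable status label.
rows = conn.execute(
    """
    SELECT o.order_id,
           COALESCE(t.city, 'unknown') AS city,
           CASE
               WHEN o.delay_minutes = 0 THEN 'on time'
               WHEN o.delay_minutes <= 30 THEN 'late'
               ELSE 'very late'
           END AS status
    FROM orders AS o
    LEFT JOIN outlets AS t ON t.outlet_id = o.outlet_id
    ORDER BY o.order_id
    """
).fetchall()
conn.close()

print(rows)
```

An inner join would silently drop order 3, whose outlet has no matching row; choosing the join type from the table relationship is what keeps it visible here.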