Esha Aishwarya

Operations Associate

Bengaluru, Karnataka, India · 1 yr 9 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Achieved 60% performance gains in data migration projects.
  • Expert in building automated data solutions using Azure Databricks.
  • Passionate about Generative AI and Machine Learning.
Stackforce AI infers this person is a Data Engineer specializing in SaaS and AI/ML solutions.

Skills

Core Skills

Data Engineering · Apache Spark · Performance Optimization · Web Scraping · Data Governance · Machine Learning · Data Analysis · SQL

Other Skills

Databricks · PySpark · PostgreSQL · Spark SQL · Data Migration · Selenium · Google Cloud Platform (GCP) · Data Lineage · Model Metadata Extraction · BeautifulSoup · Data Extraction · Software Development Life Cycle (SDLC) · Financial Risk Management · Greenplum · Raspberry Pi

About

As a Data Engineer at Genpact, I specialize in building scalable, optimized, and automated data solutions that enhance efficiency and insights. Starting as a One Data intern and moving into a full-time role, I’ve developed expertise in cloud data engineering, pipeline design, and AI-driven automation.

Currently, I work on a multi-client invoice processing product using Azure Databricks, SQL, SFTP, and Databricks Workflows. I design and optimize pipelines to automate invoice data ingestion, transformation, and validation from multiple sources, ensuring accuracy, traceability, and consistency. The solution follows a Medallion architecture (Bronze–Silver–Gold), with curated SQL views above the Gold layer powering Power BI dashboards for client and financial reporting. I also create Python automation scripts for workflow orchestration, error handling, and monitoring, improving reliability and reducing manual effort.

Earlier, I led a data migration project on Databricks, converting complex PostgreSQL pipelines into Spark SQL, achieving up to 60% performance gains through distributed processing and query optimization. I handled code translation, validation, and benchmarking to ensure seamless migration and reliable delivery. The data model followed the Bronze–Silver–Gold pattern, with Gold-layer views integrated into Tableau for real-time business reporting and insights.

During my internship, I contributed to a Hugging Face project, implementing data lineage tracing and model metadata extraction to improve ML model transparency and governance. This enhanced model usability and collaboration across teams. I’ve also done web scraping using BeautifulSoup and Selenium, extracting insights from Amazon, Flipkart, Nykaa, and YouTube for analytics. Additionally, I built AI-based tools using OpenAI’s GPT-3.5 to automate PDF invoice and Excel reconciliation, improving accuracy and efficiency.

I’m skilled in AWS (S3, Glue, Athena) for ETL workflows and proficient in Excel for analysis and visualization. My strengths lie in data integration, performance tuning, and automation. I’m passionate about Generative AI, NLP, and Machine Learning, continually exploring them through hands-on projects and certifications.

Tech Stack: Spark SQL | Azure Databricks | PostgreSQL | PySpark | SQL | Python | Power BI | Tableau | AWS (S3, Glue, Athena) | SFTP | Databricks Workflows | Pandas | NumPy | Scikit-learn | BeautifulSoup | Selenium | Hugging Face | Git | Excel | OpenAI API

Experience

Total Experience: 1 yr 9 mos
Average Tenure: 1 yr 6 mos
Current Experience: 2 mos

Wells Fargo

Data Management Associate

Mar 2026 - Present · 2 mos · Hyderabad, Telangana, India · Hybrid

Genpact

2 roles

Data Engineer

Aug 2024 - Mar 2026 · 1 yr 7 mos · On-site

  • Spearheaded the migration of legacy PostgreSQL data pipelines to Spark SQL, resulting in a 40-60% performance improvement through optimized query execution and distributed data processing.
  • Leveraged Apache Spark to enable scalable data workflows, optimizing data processing and analytics for large-scale datasets.
  • Improved data pipeline performance through best practices like predicate pushdown, broadcast joins, and in-memory caching in Spark SQL, reducing data processing time and costs.
  • Collaborated with cross-functional teams to ensure smooth data migration while maintaining data integrity and ensuring minimal downtime.
  • Assisted in optimizing data storage and retrieval strategies, leading to more efficient queries and faster data delivery for downstream analytics.
Databricks · PySpark · Data Engineering · Apache Spark

One Data Intern

Feb 2024 - Jul 2024 · 5 mos · On-site

  • Gained hands-on experience in data engineering, data visualization, advanced analytics, and Gen AI.
  • Scraped data from leading e-commerce and video platforms such as Flipkart, Amazon, Myntra, Nykaa, and YouTube, using BeautifulSoup and Selenium to handle both static and dynamic web pages.
  • Completed a project spanning websites such as Hugging Face and Data.world that went beyond data collection to trace data lineage.
  • Used advanced Excel techniques for data visualization, analysis, and cleansing. On AWS, used crawlers to extract metadata, orchestrated Glue jobs with S3 for data processing, and ran queries with Athena for data analysis.
  • Completed a course on data mining and studied Generative AI, NLP, machine learning, and deep learning, including a practical project that used GPT-3.5 through OpenAI's API to compare and reconcile data between PDF invoices and Excel sheets.
Selenium · Google Cloud Platform (GCP) · Web Scraping · Data Engineering

EatClub Brands (formerly Box8)

Data Analyst Intern

May 2023 - Jul 2023 · 2 mos

  • For optimal use of the Redash platform, I used advanced SQL concepts like arrays, CASE function, and join types. I applied these concepts to company data, employing suitable joins based on table relationships and specific data needs.
  • Crafted concise and clear messages using the Vilpower website for late orders, duplicate orders, and orders that are out of stock.
  • Utilized web scraping to extract latitude and longitude coordinates for major malls and cities, facilitating the process of selecting optimal locations for EatClub outlets.
Software Development Life Cycle (SDLC) · Financial Risk Management · Data Analysis · SQL

Internshala

Internshala Student Partner

Sep 2021 - Oct 2021 · 1 mo · Remote

Education

Vellore Institute of Technology

Bachelor of Technology - BTech — Computer Science

Jan 2020 - Jan 2024
