Aman Ghotra

Software Engineer

Bentonville, Arkansas, United States · 8 yrs 6 mos experience
Highly Stable

Key Highlights

  • Led data initiatives saving over $1.1M annually.
  • Architected cloud-native ETL pipelines with significant cost savings.
  • Mentored junior engineers to senior-level proficiency.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on Cloud Architecture and AI/ML systems in Retail and E-commerce.

Skills

Core Skills

Cloud Architecture, Data Engineering, Data Quality Automation, Project Management, Business Intelligence, Data Integration

Other Skills

Databricks, Apache Spark, PySpark, Python (Programming Language), Hive, HiveQL, Google BigQuery, Google Cloud Platform (GCP), ClickHouse, Scala, Python, Airflow/Astronomer, BigQuery, SAP SLT Streaming, Workday

About

Staff Software Engineer with 10+ years of expertise in enterprise data engineering, cloud architecture, and AI/ML systems. Currently leading large-scale data initiatives at Walmart Global Tech, processing 2.3M+ daily records and delivering $1.1M+ annual cost savings through innovative automation solutions. Currently leading a team of 7 engineers, including mentoring 3 fresh graduates to senior-level proficiency, fostering technical excellence, and driving innovation across enterprise data initiatives.

CORE TECHNICAL EXPERTISE:
  • Cloud Platforms: Google Cloud Platform (BigQuery, GCS, Dataproc), AWS (EC2, S3), Azure
  • Programming: Python, Scala, SQL, Shell Scripting, PHP
  • Big Data: Apache Spark, Hadoop, HDFS, Hive, Apache Kafka, Apache Airflow
  • AI/ML: LangChain, RAG Systems, Milvus Vector DB, Custom LLMs, Semantic Search
  • Databases: BigQuery, Snowflake, Redshift, MS SQL Server, Oracle, InfluxDB
  • ETL/Data Pipeline: Talend, Astronomer, Docker, Kubernetes, CI/CD

KEY ACHIEVEMENTS:
  1. Built Enterprise RAG-Powered Document Intelligence System using Flask, Streamlit, Milvus, and LangChain for 1500+ documents with natural language querying
  2. Architected Cloud-Native ETL Pipeline processing 7.5M Salesforce records across 1,079 columns, achieving 60% storage savings and 95% data transfer reduction
  3. Built high-performance engineering culture achieving zero critical production issues across all major releases
  4. Led Payroll Compare Testing Automation saving $1.1M annually and reducing execution time by 87% for 1.7M employees
  5. Delivered Data Quality Automation certifying 350+ tables in one quarter, onboarding 1800+ datasets with 45% execution time reduction

SPECIALIZED SKILLS:
  • Enterprise Data Architecture & Systems Design
  • Real-time Data Processing & CDC Implementation
  • Performance Optimization & Cost Reduction (90% BigQuery cost savings)
  • Cross-functional Leadership & Stakeholder Management
  • Data Quality Automation & Governance
  • Microservices Architecture & RESTful APIs
  • OAuth2 Authentication & Security Implementation

INDUSTRY EXPERIENCE: Banking & Financial Services | Retail & E-commerce | Insurance | SaaS Platforms

EU Blue Card Eligible

#DataEngineering #CloudArchitecture #BigData #AI #MachineLearning #Python #Scala #ApacheSpark #BigQuery #DataScience

Experience

Total Experience: 8 yrs 6 mos
Average Tenure: 1 yr 3 mos
Current Experience: 7 mos

Walmart

Staff Data Engineer (Merch Data and Agentic AI Foundations)

Nov 2025 - Present · 7 mos · Bentonville, Arkansas, United States · On-site

Databricks, Apache Spark, PySpark, Python (Programming Language), Hive, HiveQL, +5

Walmart Global Tech

2 roles

Staff Data Engineer (People Data Datalake)

Promoted

Nov 2024 - Nov 2025 · 1 yr

  • Led enterprise-wide data architecture and pipelines for 2.3M employees across 8 countries, aligning HCM systems with analytics, payroll, and benefits.
  • Designed and deployed 12+ Workday EIB integrations for automated onboarding, offboarding, and compensation updates; implemented validation rules and business-process triggers to achieve 99.8% data accuracy.
  • Built Workday → BigQuery CDC pipeline using Workday Event Subscription Service and Kafka, processing 500K+ daily events with automated reconciliation and <5-minute discrepancy detection.
  • Architected a Workday–Big Data Lake reconciliation system validating 2.3M+ daily employee records; reduced manual effort by 95% and BigQuery costs by 90%.
  • Architected cloud-native ETL processing 7.5M Salesforce records (1,079 columns), delivering 60% storage savings and a 95% reduction in data transfer.
  • Designed containerized microservices and RESTful APIs aligned with SOA principles.
  • Implemented automated deployments, OAuth2 authentication, CDC, and high-uptime CI/CD pipelines using Looper, Jenkins, and Docker.
  • Partnered with HR data scientists and product teams to lead enterprise-wide overhaul and automation initiatives, delivering measurable business impact.
Apache Spark, Scala, Python, Airflow/Astronomer, BigQuery, SAP SLT Streaming, +6

Senior Data Engineer

Nov 2022 - Nov 2024 · 2 yrs

  • Led enterprise-scale data initiatives across payroll automation, data quality, and dashboarding using Apache Spark, Scala, Python, Airflow/Astronomer, BigQuery, and SAP SLT Streaming.
  • Payroll Compare Testing Automation
  • Built a scalable ETL system comparing payroll data for 1.7M employees across platforms, saving $1.1M annually and reducing execution time by 87%. Integrated Spark, Scala, Python, GCS, Hive, and BigQuery.
  • SAP Payroll Report Transformation
  • Architected a reusable framework to parse and normalize 150+ unstructured SAP reports using regex and self-service ingestion, enabling advanced analytics and reducing manual effort.
  • Data Quality Automation & Certification
  • Led a cross-functional effort certifying 350+ tables in one quarter. Built a YAML-based automation tool for DQ rule execution, enabling onboarding of 1800+ datasets and cutting execution time by 45%.
  • Application Total Cost of Ownership Project:
  • Delivered automated dashboards for pre/post-load validation across 6 models, 35 tables, and 300 attributes—reducing validation effort by 80% and improving SLA turnaround by 66%.
  • Perfect Autopay US Dashboard
  • Designed and launched a payroll tracking pipeline in under a week to detect defects and minimize over/underpayments.
  • Tech Stack:
  • Python, Scala, Apache Spark, HDFS, Hive, Airflow/Astronomer, GCS Buckets, BigQuery, SAP SLT Streaming, Shell, Docker, Kubernetes, JIRA, Scrum, Architecture Design Reviews, Data Modeling
Scala, Apache Spark, Python (Programming Language), Google BigQuery, Apache Airflow, Apache Kafka, +9

Acrotek IT Solutions Inc

Big Data Engineer

Jan 2022 - Nov 2022 · 10 mos · Jersey City, New Jersey, United States · Remote

  • Led the end-to-end implementation of a data pipeline for a SaaS-based internal communications tool, enhancing the efficiency of mass messaging for the Corporate Communications team.
  • Captured and finalized project requirements, designed scalable processes, and supervised development activities.
  • Collaborated with technical leads to resolve complex challenges, ensuring on-time delivery and adherence to project milestones.
  • Maintained continuous stakeholder engagement, addressing queries and aligning technical execution with business needs.
  • Prevented a potential $100K annual financial impact by enabling successful deployment and usage of the SaaS platform.
  • Proactively mitigated risks and avoided timeline disruptions, streamlining execution and reducing rework.
  • Tech Stack:
  • GCS Buckets, HDFS, Python, Scala, Apache Spark, Shell, Airflow/Astronomer, BigQuery, Docker, Kubernetes
Apache Spark, Scala, Shell Scripting, Apache Airflow, Hive, Project Management, +5

Ziontech Solutions Inc

Intern Business Intelligence Engineer

Jul 2020 - Jan 2021 · 6 mos · Milpitas, California, United States · Remote

  • Collected, analyzed, and shared data to drive improvements in key business metrics, customer experience, and overall business performance.
  • Designed, developed, and tested comprehensive BI solutions including databases, data warehouses, queries, views, reports, and dashboards.
  • Built and evaluated innovative BI tools and automated reporting systems using web and database technologies.
  • Executed data conversions, imports, and exports across internal and external software systems to ensure seamless data flow.
  • Integrated BI platforms with enterprise applications, enhancing accessibility and usability across teams.
  • Improved BI tool performance by implementing optimized data filters and indexes, resulting in faster and more reliable insights.
  • Documented new and existing models, solutions, and implementations to support scalability and maintainability.
  • Tech Stack:
  • Python, Tableau, Snowflake, Jira
Tableau, Microsoft SQL Server, Power BI, Business Intelligence

Dgliger consulting

Digital Engineer

Oct 2019 - Jan 2020 · 3 mos · Gurugram, Haryana, India · On-site

  • Led the architecture and development of a real-time dashboard for the CEO of State Bank of India Credit Cards, delivering actionable insights to support strategic decision-making.
  • Conducted stakeholder interviews to capture business requirements and authored detailed technical specifications.
  • Designed and implemented a scalable dashboard architecture to monitor key performance indicators (KPIs) including:
  • Platform-wise expenditure trends
  • Geographic distribution of customer sourcing for new accounts
  • Hourly and daily agent performance tracking
  • Ensured high availability and responsiveness of the dashboard through real-time data integration and monitoring.
  • Tech Stack:
  • Python, InfluxDB, Grafana, Prometheus
Python, InfluxDB, Grafana, Prometheus, Data Engineering

Elevondata

2 roles

Senior Consultant

Promoted

Jan 2019 - Oct 2019 · 9 mos · Gurgaon, India

  • Geospatial Data Warehousing & Security Monitoring
  • Designed and implemented the destination data warehouse architecture to store and process geospatial data in cloud environments.
  • Developed preprocessing scripts and workflows to migrate data from legacy database servers to cloud-based platforms, enabling seamless daily and weekly data ingestion.
  • Defined key metrics and conducted a cost-benefit analysis across multiple cloud data warehouses (e.g., Snowflake, Redshift) to align with stakeholder requirements.
  • Set up a log collection and real-time alerting system using Splunk for a U.S.-based insurance client, aggregating logs from Azure Cloud, MS Office, and virtual machines to detect and respond to suspicious activity.
  • Enhanced data reliability and system performance through optimized workflows and proactive monitoring.
  • Tech Stack:
  • AWS EC2, S3, Talend, Alteryx, Hadoop, Snowflake, Redshift, MS SQL Server, Splunk, MapReduce
AWS EC2, S3, Talend, Alteryx, Hadoop, Snowflake, +5

Consultant

Feb 2017 - Jan 2019 · 1 yr 11 mos · Gurgaon, India

  • Data Integration & Reporting for Transactional Systems
  • Collaborated with clients and vendors to gather business requirements and authored comprehensive technical specifications for daily transactional data consumption.
  • Designed and developed ETL workflows to ingest and normalize data into a centralized data warehouse.
  • Built data pipelines to transfer information from the warehouse to reporting databases and data lakes, supporting diverse analytics needs.
  • Created custom reporting interfaces and KPIs, delivering tailored reports for stakeholders at multiple organizational levels.
  • As part of managed services, implemented data quality checks and validations to ensure consistent and reliable execution of daily workflows.
  • Tech Stack:
  • MS SQL Server, Talend, Python, SFTP, Power BI
MS SQL Server, Talend, Python, SFTP, Power BI, Data Integration

IndiaMART InterMESH Limited

Associate Software Developer

Jun 2016 - Feb 2017 · 8 mos · Noida, Uttar Pradesh, India

  • Preferred Number Service (PNS) to Buyer Intent Generation
  • Technologies: PHP, Oracle, cURL
  • Led the redesign of IndiaMART’s Preferred Number Service (PNS), enhancing the buyer-supplier call connection system for improved reliability and data capture.
  • Architected a new data ingestion pipeline to extract comprehensive metadata from PNS call instances and store it in a centralized data lake for downstream analytics.
  • Collaborated with external calling service vendors to design and implement a real-time, single-point API for continuous data exchange.
  • Developed robust data migration scripts to filter, validate, and transfer clean datasets from the data lake to the calling service database.
  • Conducted extensive testing of APIs and migration scripts across diverse data scenarios to ensure fault tolerance and data integrity.

Education

Saint Peter's University

Master of Science - MS — Data Science

Mar 2020 - Feb 2022

Dr B R Ambedkar National Institute of Technology, Jalandhar

Bachelor of Technology (BTech) — Computer Science and Engineering

Jan 2012 - Jan 2016
