Amit Bugalia

Software Engineer

Bengaluru, Karnataka, India · 10 yrs 3 mos experience

Key Highlights

  • Over 9 years of IT experience with Big Data specialization.
  • Expert in architecting scalable data pipelines and optimizing data workflows.
  • Proficient in managing terabytes of complex data daily.

Skills

Core Skills

Big Data · Data Engineering · Data Visualization · Software Development

Other Skills

Amazon Elastic MapReduce (EMR) · Apache Hudi · Apache Kafka · Apache Oozie · Apache Ranger · Apache Spark · Apache Spark Streaming · Core Java · Data Structures · Databricks · Datadog · Delta Lake · Eclipse · Extract, Transform, Load (ETL) · HBase

About

Over 9 years of IT experience, with 8+ years specializing in Big Data technologies, including the Hadoop ecosystem. Proficient Big Data developer with extensive hands-on experience in Spark Streaming and batch processing, Apache Hudi, Kafka, Delta tables, HDFS, Apache Ranger, MapReduce, YARN, HBase, ZooKeeper, Airflow, and multiple file formats such as Avro, Parquet, JSON, and XML. Skilled in managing and processing terabytes of complex, high-volume data daily using tools such as SQL, Git, Bitbucket, Maven, and AWS.

Experience

Eightfold AI

Staff Software Engineer

Oct 2024 – Present · 1 yr 5 mos

Teikametrics

Lead Software Engineer, Data Platform

Oct 2021 – Oct 2024 · 3 yrs · Bangalore Urban, Karnataka, India

  • Architected and implemented a highly scalable data pipeline leveraging Delta Lake, efficiently handling 5TB of daily data ingestion through Spark Structured Streaming and batch processing on Databricks.
  • Optimized Looker dashboards by designing a read-optimized presentation layer on Unity Catalog, improving performance and migrating complex queries from Snowflake to Databricks.
  • Developed a comprehensive cluster utilization and monitoring framework for Databricks, seamlessly integrating with Datadog and OpenSearch to ensure efficient resource management and real-time insights.
  • Engineered advanced partition pruning and Delta table parsing strategies, reducing infrastructure costs and significantly enhancing execution times for large-scale data workflows.
Delta Lake · Spark Structured Streaming · Databricks · Looker · Datadog · OpenSearch
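The partition-pruning work above can be illustrated with a minimal sketch. This is not the author's code: the `dt=` partition layout, bucket name, and date range are assumed for illustration, and the idea is simply that selecting partition paths up front lets downstream readers skip irrelevant data entirely.

```python
# Illustrative sketch of date-based partition pruning over
# Hive/Delta-style partitioned paths (layout and names assumed).
from datetime import date

def prune_partitions(paths, start, end):
    """Keep only paths whose dt= partition falls inside [start, end]."""
    kept = []
    for p in paths:
        # Parse key=value segments out of the path, e.g. dt=2024-01-01.
        parts = dict(kv.split("=") for kv in p.strip("/").split("/") if "=" in kv)
        d = date.fromisoformat(parts["dt"])
        if start <= d <= end:
            kept.append(p)
    return kept

paths = [
    "s3://bucket/events/dt=2024-01-01/",
    "s3://bucket/events/dt=2024-01-02/",
    "s3://bucket/events/dt=2024-02-01/",
]
# Only January partitions survive the pruning step.
jan = prune_partitions(paths, date(2024, 1, 1), date(2024, 1, 31))
```

In a real Spark/Delta pipeline this selection happens inside the engine when filters align with partition columns; the sketch only makes the mechanism visible.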

Delhivery

2 roles

Lead Data Engineer

Jan 2021 – Sep 2021 · 8 mos

  • Managed a data warehouse handling 500 GB of daily volume, leading a team of 6 engineers responsible for the data platform’s ETL and streaming operations.
  • Integrated Apache Hudi into a large-scale data pipeline using Spark Streaming over S3 and Kafka, alongside developing a robust health monitoring framework for pipeline stability.
  • Implemented a query parsing framework to enhance data module quality and optimize cloud infrastructure costs.
  • Led cross-functional collaboration with various teams to establish a comprehensive data lake pipeline for application data.
Apache Hudi · Spark Streaming · Kafka · Big Data · Data Engineering
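The pipeline health monitoring mentioned above typically centers on consumer lag. A hedged sketch, with all offsets and the threshold invented for illustration: compare end offsets with committed offsets per Kafka partition and flag the laggards.

```python
# Minimal sketch (values assumed) of a streaming-pipeline health check:
# flag Kafka partitions whose consumer lag exceeds a threshold.
def lag_report(end_offsets, committed, threshold=1000):
    """Return {partition: lag} for partitions lagging past the threshold."""
    return {
        p: end_offsets[p] - committed.get(p, 0)
        for p in end_offsets
        if end_offsets[p] - committed.get(p, 0) > threshold
    }

# Partition 1 is 3000 messages behind and gets flagged; 0 and 2 are healthy.
unhealthy = lag_report(
    end_offsets={0: 5000, 1: 12000, 2: 800},
    committed={0: 4900, 1: 9000},
)
```

In production the offsets would come from the Kafka admin/consumer APIs and the report would feed an alerting system, but the comparison itself is this simple.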

Senior Software Engineer

May 2019 – Jan 2021 · 1 yr 8 mos

  • Designed and owned the Cost Per Shipment (CPS) system, implementing an activity-based costing model using batch-based historical data analysis with Spark DataFrames for precise shipment-level cost allocation.
  • Resolved complex Spark issues including concurrent writes, memory management, shuffle inefficiencies, and write failures, gaining deep expertise in tuning Spark internals for optimal performance.
  • Introduced predictive analytics leveraging historical cost data, and implemented an outlier detection framework to ensure data accuracy and reliability.
  • Managed financial dashboards for PnL and other key finance metrics, using processed data from CPS jobs to provide actionable insights.
Spark DataFrames · Predictive Analytics · Big Data · Data Engineering
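The outlier-detection framework referenced above can be sketched with the common interquartile-range rule; the cost values, quartile approximation, and `k` multiplier here are illustrative, not taken from the CPS system.

```python
# Hedged sketch of outlier detection for shipment-level costs using the
# IQR rule: values outside [Q1 - k*IQR, Q3 + k*IQR] are flagged.
def iqr_outliers(costs, k=1.5):
    s = sorted(costs)
    n = len(s)
    # Crude quartile picks by index; fine for an illustration.
    q1, q3 = s[n // 4], s[(3 * n) // 4]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [c for c in costs if c < lo or c > hi]

# One anomalous cost stands out against an otherwise tight distribution.
flagged = iqr_outliers([10, 11, 12, 10, 13, 11, 95])
```

At Spark scale the same rule would run as an aggregation over DataFrames (e.g. `approxQuantile`), but the flagging logic is unchanged.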

Nagarro

3 roles

Senior Associate

Promoted

Jan 2018 – Apr 2019 · 1 yr 3 mos · Gurugram, Haryana, India

  • Created a 360-degree customer database for the client from multiple data sources, consolidating unique customer records for marketing use; the database supports customer segmentation and the derivation of RFM, churn, and customer lifetime value metrics with SAS analytics.
  • Implemented the business logic for customer ID generation and customer-360 variable generation.
  • Stored normalized and filtered raw data from multiple sources in the NoSQL database HBase, and persisted intermediate-stage data as Parquet files in HDFS.
SAS · HBase · HDFS · Data Engineering
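The RFM metrics mentioned above have simple definitions. The original work used SAS; this Python sketch (with invented customer IDs, dates, and amounts) only shows what is computed: Recency (days since last purchase), Frequency (purchase count), and Monetary (total spend) per customer.

```python
# Illustrative RFM computation; all data is made up for the example.
from datetime import date

def rfm(transactions, today):
    """transactions: iterable of (customer_id, purchase_date, amount)."""
    out = {}
    for cid, d, amt in transactions:
        r = out.setdefault(cid, {"recency": None, "frequency": 0, "monetary": 0.0})
        r["frequency"] += 1
        r["monetary"] += amt
        days = (today - d).days
        # Recency is the smallest gap, i.e. the most recent purchase.
        if r["recency"] is None or days < r["recency"]:
            r["recency"] = days
    return out

scores = rfm(
    [("c1", date(2024, 1, 10), 50.0),
     ("c1", date(2024, 2, 1), 20.0),
     ("c2", date(2023, 12, 1), 5.0)],
    today=date(2024, 2, 15),
)
```

Segmentation then typically bins each of the three values into quantiles and concatenates the bin labels into an RFM score.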

Associate

Sep 2016 – Dec 2017 · 1 yr 3 mos · Gurugram, Haryana, India

  • Developed a system to generate CSV files from SDP files using an N-ary tree structure in XML, optimizing data transformation and storage.
  • Parsed complex XML trees with a DOM parser, converting them into N-ary POJO trees in Core Java for efficient data manipulation.
  • Optimized CSV file size by implementing dynamic virtual links to break down N-ary trees in a streamlined manner.
  • Engineered algorithms to simplify tree structures into linear representations, enabling seamless CSV export, and designed JSON configuration files to handle diverse XML parsing requirements.
XML · Core Java · Software Development
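The tree-to-linear flow described above (parse an XML tree, then flatten it into rows for CSV export) can be sketched in a few lines. The original was Core Java with a DOM parser; this stdlib Python version uses hypothetical element names purely to show the traversal.

```python
# Sketch of flattening an N-ary XML tree into linear (path, value) rows
# suitable for CSV export; element names are hypothetical.
import xml.etree.ElementTree as ET

def flatten(elem, path=""):
    """Walk the tree depth-first, yielding (path, text) rows."""
    here = f"{path}/{elem.tag}" if path else elem.tag
    if elem.text and elem.text.strip():
        yield (here, elem.text.strip())
    for child in elem:
        yield from flatten(child, here)

doc = ET.fromstring("<order><id>42</id><items><item>pen</item></items></order>")
rows = list(flatten(doc))
```

Each row can then be written with `csv.writer`; the path column plays the role of the "linear representation" of the tree.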

Junior Associate

Aug 2015 – Aug 2016 · 1 yr · Gurugram, Haryana, India

Education

Indian Institute of Technology (Banaras Hindu University), Varanasi

Bachelor's degree — Electrical and Electronics Engineering

Jul 2011 – May 2015

Gudha Public School, Jhunjhunu (Rajasthan), India

Jan 2007 – Jan 2010
