Deepak Mahato

Director of Engineering

Bengaluru, Karnataka, India · 14 yrs 2 mos experience

Key Highlights

  • Led multi-million-dollar cloud migrations.
  • Achieved significant cost savings through optimization.
  • Built high-performing teams in data infrastructure.
Stackforce AI infers this person is a Cloud Infrastructure and Data Operations expert in the SaaS industry.

Skills

Core Skills

Cloud Infrastructure · Data Operations · Cloud Cost Optimization

Other Skills

AWS · Amazon EC2 · Amazon Elastic MapReduce (EMR) · Amazon Relational Database Service (RDS) · Amazon Web Services (AWS) · Ansible · Apache Spark · Apache Sqoop · Big Data · Big Data Analytics · BigTable · Cloud Storage · Cloudera Security · Cost Optimization · Data Analytics

About

Innovative data platform infrastructure manager with 13+ years of experience leading cloud and data infrastructure transformations for global enterprises. Expertise in GCP, AWS, Hadoop, BigQuery, Keboola, Teradata, and data lakes, driving multi-million-dollar cost optimizations, cloud migrations, and AI-powered automation. Recognized for building high-performing teams and aligning infrastructure strategy with business growth and cost efficiency.

Experience

14 yrs 2 mos
Total Experience
3 yrs 6 mos
Average Tenure
5 yrs 3 mos
Current Experience

Groupon

3 roles

Engineering Manager

Promoted

May 2024 – Present · 2 yrs

  • Leading a team of 6 engineers managing diverse technologies to ensure the continuous operation of the organization’s Data Platform Infrastructure, including the data lake and data warehouse.
  • Oversaw large-scale cloud migration projects, focusing on reducing operational costs and improving system efficiency.
  • Directed the migration of Hadoop Hortonworks clusters to GCP Dataproc, successfully transitioning 6 PB of data without disrupting production pipelines.
  • Championed the Teradata footprint reduction project, optimizing CPU and I/O utilization to downscale from a 42-node AWS cluster, achieving ~12.69% annual savings (~$1.256M over three years).
  • Implemented GCP Data Lake lifecycle management to reduce data redundancy by 80%, leading to $1.5M in annual cost savings (see the sketch below).
  • Coordinated with cross-functional teams and stakeholders to migrate and stabilize Hadoop, Spark, Hive, and HBase pipelines on GCP infrastructure.
  • Facilitated the successful migration of HBase workloads to BigTable with minimal downtime and optimized pipeline performance.
  • Conducted team performance reviews, mentoring, and skill development initiatives to build a high-performing team.
  • Played a key role in the organization’s cost optimization initiative, achieving a 50% reduction in recurring costs by transitioning GCS buckets from US multi-region to single-region setups, for total savings of $2.4 million annually.
Data Operations · Big Data · Cloud Infrastructure · GCP · Hadoop · Data Lakes +1
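
A minimal sketch of the kind of GCS lifecycle policy referenced above, using the google-cloud-storage Python client; the bucket name and age thresholds are placeholders, not the actual configuration.

```python
# Hypothetical sketch of GCS lifecycle management for a data-lake bucket.
# Bucket name and age thresholds are placeholders, not the real setup.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("example-datalake-raw")  # hypothetical bucket

# Move objects to colder storage classes as they age, then delete them.
bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.add_lifecycle_delete_rule(age=365)

bucket.patch()  # apply the lifecycle rules to the bucket
```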

Senior BigData Cloud Engineer (SDE-IV)

Jul 2023 – May 2024 · 10 mos

  • Primary Focus: Cloud Cost Optimization
  • Implemented multiple strategies aimed at reducing cloud costs.
  • Successfully saved millions of dollars annually for the company post-migration to the cloud.
Cloud Cost Optimization · Cloud Infrastructure

Database Engineer III

Feb 2021 – Aug 2023 · 2 yrs 6 mos

  • #Operations
  • Managing an Apache Hadoop cluster of 3k+ nodes with 30+ petabytes of data.
  • Supporting and managing the AWS data lake (EMR, S3, Glue) and GCP big data stack (Composer, Dataproc, Cloud Storage, Bigtable).
  • #Migration
  • Working on the migration of legacy on-prem Hadoop clusters to Google Cloud.
  • Installed and gathered metrics from the third-party tool Unravel Data, which helps segregate workloads on the cluster and plan the migration effectively at reduced cost.
  • Working closely with the Google team on the technical design document for moving the workload to the cloud.
  • Working on the design and documentation of Google Cloud Storage bucket layout and security.
  • Creating a plan to migrate 5 petabytes of data from on-prem to GCP Cloud Storage.
  • Designing and planning the deployment of Dataproc clusters per customer and batch-job requirements (see the sketch below).
  • Working on interactive data access and supporting cluster setup for customers and users: ad-hoc queries, the Tableau refresh API, JDBC connectivity, and connections to in-house tools.
  • Created an end-to-end setup that enables a BI tool to use data from a Hadoop environment.
  • Authenticated and authorized user requests.
  • Set up secure communication channels between the BI tool and the cluster using Dataproc, Cloud SQL, Cloud Storage, Terraform, and Git.
  • Worked on the plan to migrate the entire Hadoop workload to the AWS cloud.
  • Worked on the end-to-end implementation of a data lake on AWS, primarily on the data security side, covering authentication, authorization, encryption, and audit.
  • Also worked on multiple POCs, such as Amazon EMR integration with Apache Ranger and AD Kerberos, the AWS Glue Data Catalog, and AWS Lake Formation.
  • #gcp #aws #hadoop #dataproc #aws #storage #sql #terraform #git #infrastructure #datalake #teradata #datawarehouse #bigtable #composer #apache #opensource
Hadoop · AWS · GCP · Data Lake · SQL · Terraform +2
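
A minimal sketch of the Dataproc cluster provisioning referenced above, using the google-cloud-dataproc Python client; the project, region, and machine sizing are placeholders (the original work also used Terraform, Cloud SQL, and Cloud Storage).

```python
# Hypothetical sketch: creating a Dataproc cluster for batch and ad-hoc workloads.
# Project, region, and sizing values are placeholders, not the production config.
from google.cloud import dataproc_v1

project_id = "example-project"
region = "us-central1"

client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "project_id": project_id,
    "cluster_name": "example-batch-cluster",
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-8"},
        "worker_config": {"num_instances": 10, "machine_type_uri": "n1-standard-8"},
    },
}

operation = client.create_cluster(
    request={"project_id": project_id, "region": region, "cluster": cluster}
)
print(operation.result().cluster_name)  # block until the cluster is ready
```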

Accenture in India

2 roles

Big Data and Platform Specialist

Dec 2019 – Jan 2021 · 1 yr 1 mo

  • Worked on the Data Lake team for AIP (Accenture Insight Platform) as a data lake SME, supporting multiple clients’ data lake environments on the AWS, GCP, and Azure cloud platforms across the Cloudera, Hortonworks, EMR, Dataproc, and HDInsight distributions of Hadoop.
  • Designed, installed, configured, maintained, and upgraded multiple Hadoop distributions for application development and production.
  • Designed multi-node clusters for the production environment based on projected data growth.
  • Performed cluster-sizing exercises with stakeholders to understand data ingestion patterns and provided recommendations.
  • Designed and implemented non-production multi-node environments.
  • Also worked on CDP, migrating data from legacy data warehouses to CDP and from Hadoop clusters to various cloud object stores and cloud data warehouses.
Data Lake · Hadoop · AWS · GCP · Data Operations · Cloud Infrastructure

Cloud Ops Sr. Hadoop Administrator

Apr 2017 – Nov 2019 · 2 yrs 7 mos

  • Worked with Teradata vendors and SMEs to implement the Teradata solution on AWS.
  • Migrated datasets from the Hadoop system to Teradata via Sqoop (see the sketch below).
  • Implemented access policies, BAR job setup, Teradata Viewpoint monitoring, and alerts, and streamlined the process for onboarding new users and teams onto the Teradata systems.
  • Managed the Teradata systems for the EMEA and NA regions and their day-to-day operations.
  • Managed Hadoop clusters with 90+ production nodes, 54+ dev nodes, and 700+ TB of data overall, integrating various components into an effective data lake solution; worked with AWS, major Cloudera components, Talend, RStudio, and Teradata.
  • Administered and architected Hadoop stack components such as HBase, Hive, Impala, Hue, Sqoop, and Solr.
  • Implemented a CDH cluster on the AWS cloud platform using Cloudera Director.
  • Added security to the Hadoop cluster using Kerberos.
Teradata · Hadoop · AWS · Data Operations · Cloud Infrastructure
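
A minimal sketch of the Hadoop-to-Teradata transfer referenced above, invoking Sqoop’s export tool from Python; the JDBC URL, table, and HDFS path are placeholders, and the real jobs used the Teradata connector for Sqoop with site-specific credential handling.

```python
# Hypothetical sketch: exporting an HDFS dataset to Teradata with Sqoop.
# Connection details, table, and paths are placeholders.
import subprocess

subprocess.run(
    [
        "sqoop", "export",
        "--connect", "jdbc:teradata://td-host/DATABASE=analytics",
        "--username", "etl_user",
        "--password-file", "/user/etl_user/.td_password",  # avoid plain-text passwords
        "--table", "ORDERS",
        "--export-dir", "/data/warehouse/orders",          # HDFS source directory
        "--input-fields-terminated-by", "\t",
        "--num-mappers", "8",
    ],
    check=True,
)
```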

Groupon

Database Administrator

Jan 2016 – Mar 2017 · 1 yr 2 mos

  • As a Teradata Administrator
  • As Teradata DBA, responsible for maintaining all DBA functions (development, test, production) in operation 24×7.
  • ✓ Reviewed technical design documents (TDD); performance tuning, including collecting statistics, analyzing explain plans, and determining which tables needed statistics (see the SQL sketch below).
  • ✓ Worked on creating and managing the partitions.
  • ✓ Delivered new, complex, high-quality solutions to clients in response to varying business requirements; created and managed user accounts.
  • ✓ Monitoring and reporting: reviewed system utilization by user.
  • ✓ Performance tuning of poorly performing SQL queries.
  • ✓ Created and maintained users, passwords, and spool/temp space limits as required.
  • ✓ Creation and maintenance of databases and objects (tables/views/macros/procedures).
  • ✓ Compression analysis and implementation using different compression methods (MVC).
  • ✓ Worked on Teradata utilities like BTEQ, Multiload, Fast load, Fast export, and SQL assistant.
  • ✓ User Management – Creation and managing Users, Databases, Roles, Profiles, and Accounts.
  • ✓ Space Allocation – Assigning Permanent Space, Spool Space, and Temporary Space
  • ✓ Access of Database Objects – Granting and Revoking Access Rights on different database objects.
  • ✓ System Maintenance – Specification of system defaults, restart, etc.
  • ✓ Resource Monitoring – Database Query Log (DBQL) and Access Logging.
  • ✓ Data Archives, Restores and Recovery – ARC Utility and Permanent Journals.
  • ✓ Space Management – Purging data and applying MVC to huge tables.
  • ✓ BAR – DSU & ABU.
Teradata · Hadoop · SQL · Data Operations
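
A minimal sketch of routine Teradata DBA tasks referenced above (statistics collection, user creation with space limits, access grants), using the teradatasql driver; host, credentials, and object names are placeholders.

```python
# Hypothetical sketch of routine Teradata DBA tasks via the teradatasql driver.
# Host, credentials, and object names are placeholders.
import teradatasql

with teradatasql.connect(host="td-host", user="dbadmin", password="***") as con:
    with con.cursor() as cur:
        # Refresh optimizer statistics on a frequently joined column.
        cur.execute("COLLECT STATISTICS ON analytics.orders COLUMN (order_date)")

        # Create a user with explicit perm/spool space limits.
        cur.execute(
            "CREATE USER report_user FROM analytics AS "
            "PERM = 10e9, SPOOL = 50e9, PASSWORD = temp_pwd_123"
        )

        # Grant read access on the reporting database.
        cur.execute("GRANT SELECT ON analytics_views TO report_user")
```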

Wipro

Project Engineer

Nov 2011 – Dec 2015 · 4 yrs 1 mo · Bangaon, West Bengal, India

  • ✓ Worked as a Teradata developer for a banking client, Lloyds Bank, initially for about 15 months.
  • ✓ As a DBA, responsible for end-to-end delivery of the project, including reviewing technical documentation and providing comments for performance tuning; developed macros and procedures per business requirements to simplify daily activities such as access issues and environment setup.
  • ✓ Experience with Teradata utilities such as FastLoad, TPump, and MultiLoad; maintenance and development of Teradata archive/restore jobs on the BAR server; supported the live implementation of the project; experienced with Teradata Viewpoint, NetBackup, TARA, and Teradata Administrator.
Teradata · Hadoop · Data Operations

Education

PSNA College of Engineering and Technology

Bachelor of Engineering (B.E.) — Computer Engineering

Jan 2007 – Jan 2011
