Ankit Shukla — Product Manager

• A polyglot programmer with solid experience in Java, Python, and Scala, working as a Solutions Architect at AWS.
• Implemented 200+ full-stack Big Data Enterprise Data Platforms on Cloudera, Hortonworks, Azure, GCP, and AWS.
• GitHub committer, author, speaker, and trainer on various Big Data technologies.
• Specialist in scaling applications and in creating, administering, and troubleshooting Big Data clusters on the cloud.
• More than a decade of experience building Kafka and streaming data products from scratch.
• Delivered consultancy services including Big Data cluster creation, administration, data migration from on-premise to cloud, and monitoring and logging architecture.
• Currently working on Machine Learning, Computer Vision, and Deep Learning to classify and recognize images in video.
• Strong background and experience working with frameworks such as Spring and Hibernate, as well as testing and logging frameworks.
• Experience writing efficient code and configuring and deploying enterprise Java applications on the Tomcat web server and IBM WebSphere application server.
• Expertise in designing data lakes on Azure, AWS, and GCP, along with dashboarding in PowerBI and Tableau for Big Data analytics and Machine Learning.
• Recently designed and developed custom solutions for Big Data CI/CD on the cloud.
• Created various analytics engines from scratch for both batch and streaming applications. Proficient Apache Spark developer who builds use-case-specific batch and real-time data architectures and integrates different Big Data technologies with Spark.
• Sound knowledge of various SQL and NoSQL databases.
• Proven track record of working in Agile environments, with proficiency in mapping business requirements, technical documentation, application design, development, integration, testing, and bug fixing.
• Expertise in Object-Oriented Analysis and Design and in developing scalable end-to-end system designs.
• Sound knowledge of Data Structures and Algorithms.
• Also wrote wrappers around various data structures for distributed computing, such as a wrapper over RDD in Apache Spark.

Stackforce AI infers this person is a Big Data Architect with expertise in cloud services and analytics solutions.

Location: Gurgaon, Haryana, India

Experience: 12 yrs 8 mos

Skills

AWS
Data Analytics
Big Data
Software Architecture
Machine Learning
Data Architecture
Cloud Migration
Data Visualization
ETL Development
Data Modernization
Data Processing
Analytics Engine Development

Career Highlights

Expert in building scalable Big Data architectures.
Proven track record in cloud migration and data modernization.
Strong background in machine learning and analytics solutions.

Work Experience

Amazon Web Services (AWS)

Solutions Architect II (4 yrs)

Xebia

Lead Big Data Consultant (Software Architect) (11 mos)

Senior Big Data Consultant (Tech Lead and Big Data Architect) (1 yr)

Big Data Consultant (Technical Lead and Big Data Architect) (5 mos)

Big Data Consultant (Technical Lead and Big Data Architect) (9 mos)

Big Data Consultant (Technical Lead) (5 mos)

Big Data Consultant (Senior Developer) (8 mos)

Big Data Consultant (Senior Developer) (11 mos)

Nagarro

Associate Technology (1 yr 4 mos)

Impetus

Software Engineer (1 yr 3 mos)

Associate Software Engineer (1 yr)

Education

Bachelor's Degree at Krishna Institute of Engineering and Technology

PCM at K.D.B. PUBLIC SCHOOL

Other Skills

Agile
Airflow
Algorithms
Amazon Aurora
Amazon Web Services (AWS)
Apache Spark
Apache Storm
Athena
Aurora
Azure
Azure Data Factory
Azure Data Lake Storage
Azure DevOps
Bash Scripting
Behavior-Driven Development (BDD)

Experience

Total Experience: 12 yrs 8 mos

Average Tenure: 3 yrs 2 mos

Current Experience: 4 yrs

Amazon Web Services (AWS)

Solutions Architect II

Jun 2022 – Present · 4 yrs · Gurugram, Haryana, India · On-site

1. Working backwards from customer requirements to solve complex data analytics problems at petabyte scale. Also a member of the Analytics Technical Field Community at AWS, working deeply on products such as EMR, MSK, OpenSearch, Glue, Lake Formation, and Athena to solve customer problems.
2. Solid experience building data products on AWS from inception to production, with an entrepreneurial mindset.
3. Member of the Streaming Field SME group at AWS, helping customers solve streaming problems globally and shaping the Amazon MSK product by working closely with startup, enterprise, SMB, ISV, and DNB segment customers.
4. Working on the design, development, integration, and deployment of event-driven, batch, and streaming applications and their integration with ML algorithms using containers on AWS, helping customers move from ideation to execution.
5. Knows the ins and outs of setting up a variety of data platforms at scale at the org level, and drives innovation with technologies and solutions in the large-scale distributed cloud services and petabyte-scale data processing space.
6. Enjoys solving data structure problems and working on the low-level and high-level design of real-world systems.
7. Polyglot programmer with hands-on experience building products from scratch in Java and Python.

AWS, EMR, MSK, OpenSearch, Glue, Lake Formation (+2 more)

Xebia

7 roles

Lead Big Data Consultant (Software Architect)

Jul 2021 – Jun 2022 · 11 mos

Main Duties:
1) Define and own end-to-end Software Architecture from definition phase to go-live phase for large and complex systems based on client requirements.
2) Build scalable data architectures for data ingestion, storage, transformation, and analysis.
3) Design platforms as consumable data services across the organization using the Big Data tech stack.
4) Work with Client Product Management and Operations in an agile environment to accomplish team goals.
5) Assemble large, complex data sets that meet functional / non-functional business requirements.
6) Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
7) Drive innovation with technologies and solutions in the large-scale distributed cloud services and petabyte-scale data processing space.
8) Create solutions as scalable generic reusable organization-wide platforms.
9) Collaborate to deliver rapid, iterative technology proofs of concept and support the client's request-for-proposal responses with cost-effective solutions.
10) Provide technical insights by delivering recommendations, tutorials, blog articles, and technical presentations, adapting communication to partners at different technical levels.

Big Data, Data Architecture, Agile, Data Ingestion, Data Transformation, Data Analysis (+1 more)

Senior Big Data Consultant (Tech Lead and Big Data Architect)

Jul 2020 – Jul 2021 · 1 yr

Platform and Skills: Lambda, SNS, SQS, Python, Java, REST APIs, ML Algorithms, MSK, Aurora (Postgres), Fargate, Redshift, DMS, Terraform, CloudWatch, DynamoDB, DocumentDB, S3, Jenkins
Project Description: Due to COVID-19, customers are not visiting the client's physical stores and are instead purchasing more through the website, and predictive analysis indicated a large increase in online order sales during the 2020 holidays. This initiative optimizes the margin associated with shipments per online order and minimizes split shipments. It is an inventory-operations project with a direct impact on profit per online order, and it makes the order fulfilment lifecycle more intelligent through Machine Learning models.

Big Data Consultant (Technical Lead and Big Data Architect)

Jan 2020 – Jun 2020 · 5 mos

Platform and Skills: Spark, Scala, Python, EMR, S3, Redshift, Lambda, Glue, Athena, Redshift Spectrum, SQS, CloudWatch, DataPipeline, EC2
Project Description: The Consumer Data Hub (CDH) serves as the technology foundation and Data Platform that connects disparate data into a single, accessible source. The main objective is a 360-degree view of consumer behaviour, built from data collected across multiple channels (Online, Email, In-Store, Social, others), to make available: a unified consumer profile, cross-channel consumer interactions, consumer preferences, rich consumer attribution, and consumer suggestions.
Key responsibilities include:
1) Customers' trusted advisor: collaborate with the customer/AWS partner account to deliver highly efficient and cost-effective Data solutions.
2) Responsible for creating and optimizing Spark jobs in the production environment at both the infrastructure and code level.
3) As a Big Data subject matter expert, present the best Data solutions and designs from ideation to execution, along with enterprise coding best practices for scalable solutions.
4) Heavily engaged in Customer Data Platform development on AWS across Design, Development, Infrastructure, Integration, Automation, Testing, Design Patterns, and Orchestration.
5) Presented the Data Migration approach for various data sources, along with the architecture and associated costs.
6) Algorithmic implementation of a module and its automated deployment, along with orchestration.

Big Data Consultant (Technical Lead and Big Data Architect)

Mar 2019 – Dec 2019 · 9 mos

Platform & Skills: Spark, Hive, Scala, Azure, Cloudera, Azure Data Lake Storage, Azure DevOps, Splunk, Elasticsearch, Flume, Kafka, Oozie, Jenkins
Brief: The purpose of this project is to migrate the Tier-3 environment (Hadoop storage, batch and stream processing) of the Data Platform (named Helix 2.0) from on-premise to Cloudera on Azure (IaaS followed by ADLS Gen 2), implement security enhancements, and build a common ingestion layer capability for the Helix cloud platform. It also aims to improve product stability by separating the Test, UAT, and Production environments and to automate the deployment of compute and storage resources in the cloud.
Key responsibilities include:
1) Leading an onshore team through all key milestones: collaborating with client teams to gather and freeze requirements, demoing deliverables weekly, and representing the team at the client location before stakeholders including Product Owners and Senior Solution Architects, incorporating 360-degree feedback and implementing it using Agile methodologies.
2) Setup of clusters for all environments and creation of deployment packages for deploying the EDH/Helix foundational components.
3) Implement Physical Design in Azure for EDH requirements.
4) Implement CI/CD Pipeline for Batch and Streaming Applications in the cloud for EDH.
5) Complete the data migration for the project with monitoring and data reconciliation for every component of the data platform.
6) Carry out performance tests for Application and Infrastructure Benchmarking.
7) Implementation of services for Monitoring and Logging Architecture on Azure.

Big Data Consultant (Technical Lead)

Oct 2018 – Mar 2019 · 5 mos

Platform & Skills: Spark, Hive, Scala, Azure, PowerBI, Structured Streaming, Event Hub, Azure Data Lake Storage, Azure Data Factory, STRIIM, MySQL, Campaign Management Tools like Oracle Eloqua
Brief: The purpose of this product is to build an enterprise Data Platform on the cloud, using the Big Data stack for an ETL solution along with real-time dashboarding in PowerBI. The client operates property and entertainment businesses and wants to leverage the Big Data stack on the cloud to build a data platform. The solution is built on a data lake with several layers (Ingestion Layer, Processing Layer, Storage Layer), and KPIs are finally calculated for specific use cases in PowerBI using both batch and streaming applications.
Key responsibilities include:
1) Providing the architecture for end-to-end Big Data development on Azure.
2) Responsible for the high-level and low-level design of the Data Platform, as well as its implementation and deployment.
3) Collaboration with different stakeholders and teams in an agile manner.
4) Optimization of Spark jobs and troubleshooting in different environments.
5) Responsible for active development and enhancement of layers such as Data Migration, Data Ingestion, and the Speed layer, moving terabytes to petabytes of customer property and entertainment data onto the Azure cluster.
6) Efficiently leading the complete data development life cycle, including Design, Development, Deployment, and Testing.

Big Data Consultant (Senior Developer)

Feb 2018 – Oct 2018 · 8 mos

Platform & Skills: Hadoop, Sqoop, Hive, Spark, Python, pyspark, Bash Scripting, Airflow, Redis, GCP
Brief: The purpose of this product is to build a retail analytics solution on the cloud, using PySpark for the ETL solution. One goal is to deliver market-based segmentation (a retail analytics algorithm) based on customer buying patterns. The ETL solution is built on a data lake with several layers (Ingestion Layer, Caching Layer, Processing Layer), and various downstream systems draw information from this data lake based on their requirements.
Key responsibilities include:
1) Providing the approach for and writing the retail analytics ETL solution using Big Data technologies.
2) Creating the high-level and low-level design of the assigned module.
3) Behavior-Driven Development of scalable architecture using Big Data technologies.
4) Automating the deployment process using GitLab CI/CD.
5) Migrating existing retail domain data to the cloud (Google Cloud Platform).

Big Data Consultant (Senior Developer)

Mar 2017 – Feb 2018 · 11 mos

Platform & Skills: Hadoop, Sqoop, Hive, Spark, HBase, Bash Scripting, Java, Python, Scala, Kafka, Flume, MongoDB
Brief: Data Modernization: a data platform whose objective is to generate insights from data across different categories of an investment bank. It takes data in EBCDIC format, cleans it, and pushes it into the data lake, where the business logic for the different use cases resides. The main objective is to transfer data to Hadoop, create an analytics layer on top of it using Hive, and provide insights to business users and other platforms when needed using various Big Data technologies.
Roles and Responsibilities:
• Worked at the client location, handling investment bank data by leveraging Big Data technologies.
• Worked on the development of a real-time streaming architecture from scratch.
• Responsible for the high-level and low-level design of the Data Platform, as well as its implementation and deployment in the development and QA environments.
• Built complex non-interactive systems (batch, distributed, etc.) in the project.
• Developed efficient shell scripting jobs that minimized code work and increased the efficiency of the whole process.
• Responsible for active development and enhancement of layers such as Data Migration and Data Ingestion, moving terabytes to petabytes of monetary credit card transaction and customer detail data into the Hadoop cluster.
• Generated insights from large volumes of customer data for use cases such as Fraud Detection, Customer 360 degree, Benefits and Values, etc.
• Developed and contributed to existing Java, Python, and Scala applications that ingest data into the Hadoop ecosystem efficiently.

Nagarro

Associate Technology

Oct 2015 – Feb 2017 · 1 yr 4 mos · Gurugram, Haryana, India

The role involves the following:
Design and development of robust data processing pipelines for enterprise-level business analytics.
Using Big Data technologies such as Apache Spark, Elasticsearch, and MongoDB to implement an analytics engine.
Learning Predictive Analytics, Content Analytics, and Machine Learning.
Recently worked on creating an ETL tool to solve one of the analytics problems.
Providing solutions to various analytics engine problems and then implementing them.
Actively involved in implementing the various phases of analytics engine development from scratch: Data Discovery, Data Exploration, Data Prediction, Data Visualization, and Dashboards.
Working with a smart and innovative team to solve various complex parallel data processing problems.

Big Data Technologies, Apache Spark, Elasticsearch, MongoDB, Data Processing, Analytics Engine Development

Impetus

2 roles

Software Engineer

Promoted

Jul 2014 – Oct 2015 · 1 yr 3 mos

The role involves:
Writing efficient code and documenting it as per the design specification of module / component
Deploying components and/or applications
Unit testing, debugging and maintaining code in one's own development environment
Preparing the low level design document if required
Low level scheduling of the Module
Interacting with Senior Software Engineers to get the requirements/design
Reviewing the code
Learning release cycle and performing in various phases including packaging and deployment

Associate Software Engineer

Jul 2013 – Jul 2014 · 1 yr

Writing efficient code and documenting it as per the design specification of module/component.
Deploying components and/or applications.
Unit testing, debugging and maintaining code in project development environment.
Problem-solving and thinking laterally as part of a team, or individually, to meet the needs of the project.
Participating in design discussions and analyzing the problem in detail to understand the key points.