Shaailesh N.

Software Engineer

Edinburgh, Scotland, United Kingdom · 9 yrs 5 mos experience

Key Highlights

  • Expert in designing robust data solutions.
  • Proven track record in optimizing data pipelines.
  • Hands-on experience with real-time stream processing.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on Big Data Analytics.

Skills

Core Skills

Data Engineering · Big Data Analytics

Other Skills

AWS Lambda · Akka · Amazon CloudWatch · Amazon Kinesis · Amazon Redshift · Amazon Web Services (AWS) · Apache Airflow · Apache Druid · Apache Flink · Apache Iceberg · Apache Kafka · Apache Spark · Big Data · BigQuery · Confluent

About

Experienced Data Engineer and Software Developer adept at designing and implementing robust data solutions to support large-scale operations. Skilled in SQL, Spark, and Kafka, with a proven track record of optimizing data pipelines to handle massive volumes of data efficiently. Proficient in data modeling and architecture, with hands-on experience in building real-time stream processing systems. Strong background in software development, API integration, and campaign management. Excited to leverage expertise in driving data-driven insights and innovation.

Experience

Skyscanner

Software Engineer

Apr 2023 – Present · 2 yrs 11 mos · Edinburgh, Scotland, United Kingdom · On-site

Terraform · GitHub Actions

Dream11

SDE-2/Data Engineer

May 2021 – Mar 2023 · 1 yr 10 mos · Mumbai, Maharashtra, India · On-site

  • Data Migration Expertise:
      • Spearheaded the migration initiative from a traditional Redshift warehouse system to a scalable and efficient data lake infrastructure.
      • Executed migration tasks successfully, ensuring a seamless transition and minimal disruption to ongoing operations.
  • Adoption of Apache Iceberg:
      • Implemented Apache Iceberg as the storage format for the data lake, enhancing data reliability, versioning, and schema evolution capabilities.
      • Conducted a proof of concept (POC) on Trino as a compute layer for Apache Iceberg, optimizing query performance and data retrieval.
  • Big Data Optimization:
      • Optimized Spark jobs on Amazon EMR clusters, improving processing speed and resource utilization.
      • Implemented best practices to enhance the overall efficiency of the big data processing framework.
  • Data Modeling:
      • Applied data modeling expertise to design efficient and scalable data structures, ensuring optimal data organization and storage.
  • Real-time Data Platform:
      • Played a key role in developing and maintaining a real-time data platform handling an event concurrency of 200 million requests per minute and processing 24 billion records.
      • Used Apache Flink to build a robust real-time alerting mechanism, enhancing the system's responsiveness to critical events.
  • KPI Report Generation:
      • Developed KPI report generation processes to provide actionable insights to stakeholders, enabling informed decision-making.
      • Ensured the accuracy and timeliness of key performance indicator reports.
Databricks · Amazon Redshift · Amazon Web Services (AWS) · Apache Flink · Apache Spark · Data Engineering +1
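The Flink-based alerting mechanism described above can't be reproduced outside a cluster, but the core keyed tumbling-window logic is easy to model in plain Python. This is an illustrative sketch of the pattern, not actual Flink API code from the original system; the function name, event shape, and threshold are assumptions.

```python
from collections import defaultdict

def tumbling_window_alerts(events, window_ms, threshold):
    """Group (timestamp_ms, key) events into fixed tumbling windows and
    emit (window_start, key, count) when a key's count in a window
    exceeds the threshold -- the logic a Flink keyed tumbling-window
    job would apply continuously and at much larger scale."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_ms)  # align to window boundary
        counts[(window_start, key)] += 1
    return [(w, k, c) for (w, k), c in sorted(counts.items()) if c > threshold]

# Example: alert when any event key fires more than 2 times in a 1 s window.
events = [(10, "error"), (400, "error"), (900, "error"), (950, "ok"), (1200, "error")]
alerts = tumbling_window_alerts(events, window_ms=1000, threshold=2)
# -> [(0, "error", 3)]
```

In a real Flink job the same grouping would be expressed with `keyBy` plus a tumbling event-time window, with watermarks handling late events; the batch version above just makes the window arithmetic explicit.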

Paytm

Data Engineer

Sep 2018 – May 2021 · 2 yrs 8 mos · Noida, Uttar Pradesh, India

  • 1. Pulse – Real-Time Stateful Stream Processing & Analytics System (Fast Data)
      • Flink on YARN – real-time stateful stream transformation
      • Druid – a time-series database
      • Ingestion-time data cubes to achieve sub-second query response
      • ThetaSketches – count distinct values over incrementally rolled-up data
      • Analytics dashboards – forked Apache Superset
      • Eagle Eye (health and performance monitoring) – Play Framework, Scala, ELK Stack
      • Batch processing & data warehousing – Parquet, AWS S3, Hive, AWS Athena
      • More details in the Projects section
  • 2. GA Facts & Data Pipeline
      • Conceptualised and created a complete Spark, BigQuery, and Python based framework to process data partly on Google BigQuery and partly in our Spark clusters to produce business-useful insights.
      • 'GA Facts' completely changed the way GA data was leveraged at Paytm, as it solved multiple use-cases across different business verticals.
      • Optimised batch processing of 1 TB/day of Google Analytics clickstream data from a staggering 21 hrs down to 4.5 hrs.
  • 3. ElasticSearch-Based Near-Real-Time Pipeline with Grafana
      • Optimisations in ES-Push (NRT dashboards) to reduce the size of stored data and to cut pipeline failures through better exception handling.
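The ThetaSketches bullet above refers to approximate distinct counting over rolled-up data. The production version is Druid's DataSketches integration; as a toy illustration of the underlying idea, here is a minimal K-minimum-values ("theta") sketch in plain Python. The class name and parameters are assumptions for this sketch, not the real library API.

```python
import hashlib

class KMVSketch:
    """Minimal K-minimum-values sketch: keep the k smallest 64-bit hash
    values seen; below capacity the count is exact, above it the distinct
    count is estimated as (k - 1) / theta, where theta is the k-th
    smallest hash mapped into (0, 1]. Sketches over partial rollups can
    be merged, which is what makes rollup-then-union counting work."""
    MAX_HASH = 2 ** 64

    def __init__(self, k=256):
        self.k = k
        self.hashes = set()

    def _hash(self, item):
        # Deterministic 64-bit hash of the item's string form.
        return int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")

    def update(self, item):
        self.hashes.add(self._hash(item))
        if len(self.hashes) > self.k:
            self.hashes.remove(max(self.hashes))  # keep only the k smallest

    def merge(self, other):
        """Union two sketches by keeping the k smallest of both hash sets."""
        merged = KMVSketch(self.k)
        merged.hashes = set(sorted(self.hashes | other.hashes)[: self.k])
        return merged

    def estimate(self):
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # exact below capacity
        theta = max(self.hashes) / self.MAX_HASH
        return (self.k - 1) / theta
```

With k = 256 the relative error is roughly 1/√k ≈ 6%, which is the trade-off that lets Druid answer count-distinct queries over pre-aggregated segments in sub-second time.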

Connaizen

Data Engineer

Oct 2017 – Sep 2018 · 11 mos · Gurugram, Haryana, India

  • Responsible for the maintenance, improvement, cleaning, and manipulation of data from the bank's business and analytics databases using PySpark.
  • Created a RESTful API (Django, DRF, Pandas, NumPy) for a Campaign Management Tool.
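The campaign-management API above was built with Django REST Framework; as a framework-free illustration of the request-handling shape, the validate-then-respond logic of such an endpoint can be modelled as a pure function. Field names and the endpoint's behaviour here are hypothetical, not taken from the original tool.

```python
import json

# Hypothetical required fields for a campaign-creation request.
REQUIRED = {"name", "segment", "start_date"}

def create_campaign(body: str):
    """Validate a JSON request body and return (status_code, response),
    mirroring what a DRF serializer plus view would do: reject malformed
    JSON and missing fields with 400, otherwise answer 201 Created."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "invalid JSON"}
    if not isinstance(payload, dict):
        return 400, {"error": "expected a JSON object"}
    missing = REQUIRED - payload.keys()
    if missing:
        return 400, {"error": f"missing fields: {sorted(missing)}"}
    return 201, {"campaign": payload["name"], "status": "created"}
```

In DRF the same checks would live in a serializer's field declarations, keeping the view itself to a few lines.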

Py technology pvt. ltd.

Software Developer

Sep 2016 – Oct 2017 · 1 yr 1 mo · Gurugram, Haryana, India

  • Developed an ERP solution for Snapdeal and EComm logistics from scratch (Python, Django, JavaScript, jQuery, MySQL, Google Charts).
  • Created complex modules for reporting, challan generation, vehicle route planning, pickup trips, and shipment and pallet handling.
  • Integrated APIs with various third-party systems for shipment manifests.
  • Built real-time shipment tracking (Kafka, PySpark, MongoDB).

Education

Uttarakhand Technical University

B.Tech

Jan 2010 – Jan 2014
