Pradeep Yogesh

CEO

Bengaluru, Karnataka, India · 12 yrs 2 mos experience

Key Highlights

Expert in building scalable data pipelines.
Proficient in big data technologies like Kafka and PySpark.
Strong background in web development with Django.

Stackforce AI infers this person is a Data Engineering specialist with expertise in SaaS and Embedded Systems.

Skills

Core Skills

Big Data, Data Engineering, Search Technologies, Web Development, Web Crawling, Embedded Systems

Other Skills

Kafka, Solr, MySQL, Redis, DynamoDB, Airflow, Hive, PySpark, Sqoop, Django, Python, Selenium, Puppeteer, Celery, RabbitMQ

About

Experienced Data Engineer with a demonstrated history of working in the information technology and services industry. Skilled in big data, full-stack development using Python and Java, web crawlers, and Linux. Strong information technology professional with a Master of Technology (M.Tech.) in Information Technology from the International Institute of Information Technology – Bangalore.

Experience

Total Experience: 12 yrs 2 mos

Average Tenure: 3 yrs 7 mos

Current Experience: 1 yr 4 mos

Dataweave

5 roles

Technical Architect

Feb 2025 – Present · 1 yr 4 mos · Bengaluru, Karnataka, India · Hybrid

Technical Lead

Promoted

Oct 2021 – Sep 2022 · 11 mos

● Deployed and managed the Kafka cluster to handle high-volume data from distributed crawls.
● Deployed and managed SolrCloud, which is used as a search engine to identify similar documents across sources.
● Designed and developed the MySQL database that is used in our internal dashboards / API. Used Redis as a cache for better performance.
● Worked on DynamoDB for querying historical data in real time; designed the schema and other policies such as retention period, autoscaling, etc.
● Built Airflow DAGs for various automation workflows that involved scheduling crawls, transforming the crawled data, and loading it into client-specific datastores/paths.
● Built a Hive data warehouse using AWS S3 and Qubole for historical data analysis. Involved in data modeling and query-level optimizations for better performance.
● Built efficient ETL pipelines using PySpark and integrated them with the workflow management tool Airflow. Worked with different file formats such as JSON, Parquet, ORC, etc.
● Built an efficient SQL ingestion system using Sqoop incremental jobs to ingest MySQL data into the Hive data warehouse for analytical purposes.

Kafka, Solr, MySQL, Redis, DynamoDB, Airflow +5

Senior Data Engineer

Promoted

Apr 2019 – Oct 2021 · 2 yrs 6 mos

● Involved in designing and building the config manager using Django, which helps to store the dynamic crawl and extraction information of a website. Exposed an API for the data to the internal systems.
● Involved in designing and building an internal dashboard using Django that is used to monitor the crawl jobs and take actions based on the crawl health.
● Involved in building a proxy API for crawlers; optimized performance using a Redis cache for the data served.

Django, Redis, Web Development

Data Engineer

Jul 2017 – Apr 2019 · 1 yr 9 mos

● Involved in building a distributed crawling framework that is capable of crawling data from the web at a very high scale.
● Used the Python requests module, Selenium, and Puppeteer for crawling. Scheduling of multiple crawling jobs was handled through Celery along with RabbitMQ.
● The crawler has an inbuilt retry mechanism based on the success rate.
● Real-time health of the crawling jobs was monitored through EFK (Elasticsearch, Fluentd, Kibana).

Python, Selenium, Puppeteer, Celery, RabbitMQ, Web Crawling

Intern

Jan 2017 – Jun 2017 · 5 mos

Cisco

Senior Data Engineer

Sep 2022 – Feb 2025 · 2 yrs 5 mos · Bengaluru, Karnataka, India

Part of Cisco's "Networking Academy" online education platform.

Tata Consultancy Services

System Engineer

Aug 2012 – Jun 2015 · 2 yrs 10 mos · Bengaluru Area, India

Project Name: Jaguar Land Rover Infotainment
Project Description: To develop the Jaguar Land Rover Next Generation in-vehicle Infotainment (NGI) system.
Technology: Embedded C++
Tools Used: Rhapsody, GLStudio
Project Role:
Part of Tuners HMI Team.
Developed GUI and Business Logic for tuner features DAB and Radio.
Created a Linux software object which can be launched from the main application.
Performed manual testing of the applications developed, including unit testing, code testing, and module testing.
Performed debugging and bug fixing for code defects.
Analyzed the defects raised in ClearQuest using DLT logs and provided solutions.