Pradeep Yogesh

CEO

Bengaluru, Karnataka, India · 12 yrs 2 mos experience

Key Highlights

  • Expert in building scalable data pipelines.
  • Proficient in big data technologies like Kafka and PySpark.
  • Strong background in web development with Django.
Stackforce AI infers this person is a Data Engineering specialist with expertise in SaaS and Embedded Systems.

Skills

Core Skills

Big Data, Data Engineering, Search Technologies, Web Development, Web Crawling, Embedded Systems

Other Skills

Kafka, Solr, MySQL, Redis, DynamoDB, Airflow, Hive, PySpark, Sqoop, Django, Python, Selenium, Puppeteer, Celery, RabbitMQ

About

Experienced Data Engineer with a demonstrated history of working in the information technology and services industry. Skilled in big data, full-stack development using Python and Java, web crawlers, and Linux. Strong information technology professional with a Master of Technology (M.Tech.) focused in Information Technology from International Institute of Information Technology – Bangalore.

Experience

Total Experience: 12 yrs 2 mos
Average Tenure: 3 yrs 7 mos
Current Experience: 1 yr 4 mos

Dataweave

5 roles

Technical Architect

Feb 2025 - Present · 1 yr 4 mos · Bengaluru, Karnataka, India · Hybrid

Technical Lead

Promoted

Oct 2021 - Sep 2022 · 11 mos

  • Deployed and managed the Kafka cluster to handle high-volume data from distributed crawls.
  • Deployed and managed the SolrCloud cluster used as a search engine to identify similar documents across sources.
  • Designed and developed the MySQL database used by internal dashboards and APIs; used Redis as a cache for better performance.
  • Worked on DynamoDB for querying historical data in real time; designed the schema and policies such as retention period and autoscaling.
  • Built Airflow DAGs for automation workflows that schedule crawls, transform the crawled data, and load it into client-specific datastores/paths.
  • Built a Hive data warehouse using AWS S3 and Qubole for historical data analysis. Involved in data modeling and query-level optimizations for better performance.
  • Built efficient ETL pipelines using PySpark and integrated them with the Airflow workflow management tool. Worked with file formats such as JSON, Parquet, and ORC.
  • Built an efficient SQL ingestion system using Sqoop incremental jobs to ingest MySQL data into the Hive data warehouse for analytics.
Kafka, Solr, MySQL, Redis, DynamoDB, Airflow +5

Senior Data Engineer

Promoted

Apr 2019 - Oct 2021 · 2 yrs 6 mos

  • Involved in designing and building the config manager using Django, which helps store the dynamic crawl and extraction information of a website. Exposed an API for this data to internal systems.
  • Involved in designing and building an internal dashboard using Django that is used to monitor crawl jobs and take actions based on crawl health.
  • Involved in building a proxy API for crawlers; optimized performance using a Redis cache for the data served.
Django, Redis, Web Development

Data Engineer

Jul 2017 - Apr 2019 · 1 yr 9 mos

  • Involved in building a distributed crawling framework capable of crawling data from the web at very high scale.
  • Used the Python requests module, Selenium, and Puppeteer for crawling. Scheduling of multiple crawling jobs was handled through Celery with RabbitMQ.
  • The crawler has a built-in retry mechanism based on the success rate.
  • Real-time health of the crawling jobs was monitored through EFK (Elasticsearch, Fluentd, Kibana).
Python, Selenium, Puppeteer, Celery, RabbitMQ, Web Crawling

Intern

Jan 2017 - Jun 2017 · 5 mos

Cisco

Senior Data Engineer

Sep 2022 - Feb 2025 · 2 yrs 5 mos · Bengaluru, Karnataka, India

  • Worked on Cisco's "Networking Academy" online education platform.

Tata Consultancy Services

System Engineer

Aug 2012 - Jun 2015 · 2 yrs 10 mos · Bengaluru Area, India

  • Project Name: Jaguar Land Rover Infotainment
  • Project Description: Develop the Jaguar Land Rover Next Generation in-vehicle Infotainment (NGI) system.
  • Technology: Embedded C++
  • Tools Used: Rhapsody, GLStudio
  • Project Role:
  • Part of the Tuners HMI team.
  • Developed GUI and business logic for the tuner features DAB and Radio.
  • Created a Linux software object that can be launched from the main application.
  • Performed manual testing of the applications developed, including unit testing, code testing, and module testing.
  • Performed debugging and bug fixing for code defects.
  • Analyzed defects raised in ClearQuest using DLT logs and provided solutions.
Embedded C++, Linux, Embedded Systems

Education

International Institute of Information Technology Bangalore

Master of Technology (M.Tech.) — Information Technology

Jan 2015 - Jan 2017

B. M. S. College of Engineering

Bachelor of Engineering - BE

Jan 2008 - Jan 2012

Rotary English School, Sakleshpur.

Jan 2006 - Present

Sadvidya Semi Residential PU College, Mysore
