Jatan Sahu

Data Engineer

Bengaluru, Karnataka, India · 6 yrs 1 mo experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in developing scalable ETL pipelines.
  • Strong background in machine learning and data engineering.
  • Proficient in optimizing data processing and infrastructure.
Stackforce AI infers this person is a Data Engineering and Machine Learning specialist in the SaaS industry.

Skills

Core Skills

Data Engineering · Machine Learning · Data Management · Computer Vision · Leadership

Other Skills

AWS · AWS Glue · AWS Lambda · AWS SageMaker · Agile Methodologies · Airflow · Algorithms · Amazon S3 · Amazon Web Services (AWS) · Apache Flink · Apache Kafka · Apache Spark · Artificial Neural Networks · Athena · Big Data

About

Skilled Data Science Professional with hands-on experience in developing ETL pipelines, Machine Learning Models, and optimizing system performance. Proficient in Python, SQL, Spark, and AWS big data technologies. Strong background in data science concepts, including Machine Learning, data warehousing, and data modeling. Adept at working closely with data scientists, analysts, and other stakeholders to understand and meet data needs, ensuring data reliability, efficiency, and quality.

Experience

Total Experience: 6 yrs 1 mo
Average Tenure: 1 yr 4 mos
Current Experience: 1 yr 6 mos

Yulu

2 roles

Data Engineer

Dec 2024 – Present · 1 yr 6 mos · On-site

  • Designed and maintained ETL pipelines on Amazon EMR & EC2 for scalable data processing.
  • Led a database consolidation project using DuckDB, integrating multi-source data into a unified data lake.
  • Migrated pipelines from Hudi to Iceberg and optimized Spark pipelines by converting them to Python, achieving significant cost savings.
  • Automated DAG creation & ingestion flows in Airflow for incremental and dimension tables, reducing manual effort.
  • Built unified ETL pipelines for PostgreSQL, MySQL, & Kafka, cutting data latency.
  • Configured dynamic EMR clusters with spot instances and Airflow orchestration, enabling reliable execution of 150+ daily ETL jobs.
  • Developed real-time allocation systems that improved issue resolution and operational efficiency.
  • Automated manual operations using optimization models, boosting overall capacity.
  • Enhanced allocation optimization with pooling, scheduling algorithms, and scoring logic, driving improvements in efficiency and throughput.
Data Maintenance · Python (Programming Language) · PySpark · Airflow · SQL · Hudi · +7

Data: Intern

Sep 2024 – Dec 2024 · 3 mos · On-site

  • Working behind the scenes to manage and optimize the vast amount of data that powers our operations, transitioning from batch processing to real-time stream processing to deliver smooth and efficient services to our users.
  • Optimizing data infrastructure, transitioning from batch to real-time stream processing.
  • Developed a crucial sweep report script for bike operations.
  • Automated e-way bill invoicing, saving manual effort for the inventory team.
  • Managing Airflow, Cron, Spark jobs, and ingestion pipelines.
  • Improved data ingestion efficiency using DBT Hub.
  • Building NLQ-powered data generation with DS team using LLMs for non-tech users w
  • Developing a Customer Support Chatbot (Deepseek-r1, Ollama) & automating query analysis with DS team.
  • Exploring Apache Iceberg for data optimization.
Python · SQL · MetaBase · Athena · Optimization · Kotlin · +5

Growexx

Junior Data Engineer

Jan 2024 – Jul 2024 · 6 mos · Ahmedabad, Gujarat, India · On-site

  • Developed a system to integrate data from multiple sources, ensuring high accuracy and reliability.
  • Developed robust cloud-based ETL pipelines, reducing data processing time by 40%.
  • Utilized Airflow for orchestration and AWS for data storage and processing.
  • Evaluated and recommended new technologies to enhance data processing capabilities.
  • Tech learned: SQL, Python, Talend, Spark, AWS, Azure, Snowflake, Airflow, Tableau, ML, Prompt Engineering, LLMs, Kafka.
Python (Programming Language) · Agile Methodologies · Scrum · Mathematics · SQL · PostgreSQL · +13

Dhirubhai Ambani Institute of Information and Communication Technology

Research Intern

Jan 2023 – Dec 2023 · 11 mos · Gandhinagar, Gujarat, India · On-site

  • Implemented an automated fabric defect detection system using computer vision techniques in a team of 5.
  • Aggregated unstructured data from 20+ sources and determined which Deep Learning models perform most effectively.
  • Reduced false positives and improved accuracy using a CNN, resulting in enhanced product quality.
  • Worked on algorithmic trading:
    ▪ Conducted a detailed literature review to identify popular methods for stock index prediction.
    ▪ Developed a stock trading strategy using ARIMA, LSTM, and CNN, with 68% prediction accuracy for index trends.
Computer Vision

Business Club - DAIICT

Core member

Oct 2022 – Sep 2023 · 11 mos · Gandhinagar, Gujarat, India · On-site

  • Organised many events, such as echai and workshops, on campus with my teammates. Also worked on business ideas and applied my data science skills to make business decisions. Currently managing the business club's social media handles.
Team Leadership · Public Speaking · Business Analysis · Problem Solving · Data Analysis · Leadership

Niti tantra

Graphic designer

May 2021 – Jun 2021 · 1 mo

National Service Scheme

Student Volunteer

Jul 2018 – Jun 2021 · 2 yrs 11 mos · Jabalpur, Madhya Pradesh, India · On-site

Social Services · Leadership

Education

Dhirubhai Ambani University

MSc. Data Science

Jul 2022 – Jul 2024

St. Aloysius College, Jabalpur

Bachelor's degree — Computer Science

Jan 2018 – Jan 2021
