Saosri Ghosal

Machine Learning Engineer

West Bengal, India · 7 yrs 2 mos experience

AI Enabled · AI ML Practitioner

Key Highlights

Reduced query times from 72 hours to 10 minutes.
Built deep learning models for revenue forecasting.
Expertise in data engineering and cloud technologies.

Stackforce AI infers this person is a Data Engineer with strong expertise in SaaS and cloud-based solutions.

Skills

Core Skills

Machine Learning · Data Engineering · Data Streaming · Data Processing · Data Analytics · Cloud Computing · ETL Processes

Other Skills

Data Architects · Apache Kafka · Apache Spark · Apache Airflow · PySpark · SQL · Databricks · Kafka · DBT · Tableau · Deep Learning · Google Cloud · BigQuery · Terraform · Azure Data Factory

About

As a Data Engineer II at Atlassian, I helped my team significantly advance data processing, for example by reducing query times from 72 hours to 10 minutes. With a focus on data integrity and efficiency, I have been pivotal in implementing API-driven data management solutions and in using Databricks and Kafka for data streaming and ingestion. I am currently pursuing an M.Tech in Computer Science with a specialization in Data Science at IIT Hyderabad, deepening my skills in machine learning and deep learning. My education is complemented by professional certifications, including Azure Developer Associate, which strengthen my competencies in Azure Databricks and Spark ML.

Experience

Total Experience: 7 yrs 2 mos

Average Tenure: 2 yrs 1 mo

Current Experience: 11 mos

Target

Senior Machine Learning Engineer - External Search Engine (SEO Data)

Jul 2025 – Present · 11 mos · Bengaluru, Karnataka, India · Hybrid

Data Architects · Apache Kafka · Machine Learning · Data Engineering

Atlassian

Data Engineer II - Senior Associate

Sep 2022 – Jun 2025 · 2 yrs 9 mos · Bengaluru, Karnataka, India · Remote

Developed a framework using PySpark and SQL to extract revenue data for products like Jira and Confluence, reducing query runtime from 72 hours to 10 minutes through optimization techniques.
Leveraged Databricks Structured Streaming for near-real-time stream ingestion, integrating Kafka with Delta Lake built on top of AWS S3 buckets.
Implemented APIs for seamless data extraction and ingestion into AWS S3 buckets, ensuring near-real-time processing while maintaining accuracy and availability.
Transformed data using DBT for downstream analytics.
Built a deep learning model to forecast weekly revenue trends for Atlassian Marketplace plugins, publishing insights to Tableau for actionable decision-making.
Designed dimensional models for job migration and analytical use cases.
Developed data architecture from scratch, collaborating daily with stakeholders to carry work from requirements gathering through to delivery.
Established robust frameworks using Unity Catalog for governance and validation, ensuring data consistency and compliance.
Utilized Apache Airflow to orchestrate transformations, enabling smooth, reliable workflows with almost zero failures in production.
Provided timely resolution of database-related issues, ensuring minimal downtime and seamless operation for downstream users during on-call shifts and beyond.

Apache Spark · Apache Airflow · PySpark · SQL · Databricks · Kafka · +4 more

Flipkart

Data Engineer-I (Fin-tech & Payments Group - Flipkart Pay Later)

Dec 2021 – Sep 2022 · 9 mos · Bengaluru, Karnataka, India

Developed a near real-time framework using PySpark to automate CIBIL score retrieval, enhancing efficiency.
Created multiple feature pipelines to assess customer affordability, reducing processing time by 70-80%.
Migrated on-premise infrastructure to Google Cloud, leveraging services like BigQuery and Terraform for provisioning.
Actively participated in code reviews and discussions to optimize Spark performance and ensure quality.

PySpark · Google Cloud · BigQuery · Terraform · Data Engineering · Cloud Computing

Tata Digital

Data Engineer (Tata Neu Super Coin App - Retail R&D)

Mar 2019 – Dec 2021 · 2 yrs 9 mos · Kolkata, West Bengal, India

Designed and implemented logical and physical data models for the Tata Neu Super Coin App.
Developed efficient ETL processes using Apache Spark and Azure Data Factory, significantly enhancing data loading efficiency.
Optimized batch processing times, reducing fact processing from 8 hours to 2 hours, improving overall system performance.

Apache Spark · Azure Data Factory · Data Engineering · ETL Processes

Webel Informatics Ltd.

Internship Trainee

Jul 2018 – Dec 2018 · 5 mos · Kolkata, West Bengal, India

Developed robust backend microservices using Spring Boot, enhancing application performance and scalability.
Designed and implemented dynamic front-end interfaces with Node.js, EJS templates, JavaScript, and CSS.
Conducted comprehensive testing of web pages to ensure seamless user experience and functionality.