Subodh Joshi

Data Scientist

Pune, Maharashtra, India · 5 yrs 8 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • 5+ years of experience in data engineering.
  • Expert in building robust ETL workflows.
  • Proven track record of optimizing data models.
Stackforce AI infers this person is a Data Engineer specializing in cloud-based data solutions for enterprise applications.

Skills

Core Skills

Data Engineering · ETL · Cloud Solutions · Data Science

Other Skills

ADFS · ADLS Gen2 · AI/ML · API Development · AWS Glue · Agile Methodologies · Analytical Skills · Analytics · Apache Airflow · Apache Spark · Azure Cloud · Azure Data Factory · Azure Data Lake · Azure Databricks · Azure Logic Apps

About

As a passionate Data Engineer, I thrive on transforming raw data into meaningful insights for better business decisions. With 5+ years of experience in designing and scaling data pipelines, I specialize in building robust ETL workflows using Python, SQL, Azure and AWS to enable analytics, reporting, and machine learning across large organizations. My expertise spans end-to-end data architecture, cloud migration, and optimizing data models to unlock actionable value for diverse teams: data analysts, scientists, and executives alike.

I am driven by problem-solving: whether identifying bottlenecks in existing data flows or architecting solutions for complex business challenges, I love rolling up my sleeves to deliver real impact. My work has resulted in streamlined data ingestion, reduced costs, and improved data accessibility. I take pride in high uptime, continuous improvement, and mentoring junior engineers to uphold best practices in data quality and integrity.

I am eager to connect with organizations aiming to unlock the business potential of their data. If your team needs an engineer who brings technical excellence, creativity, and strategic thinking to every project, let’s talk.

Core competencies: Data Engineering | Data Pipeline Design | ETL | SQL | Python | AWS | Azure | Data Warehousing | Git | Machine Learning Integration | Cloud Solutions | Business Intelligence

Experience

Total Experience: 5 yrs 8 mos
Average Tenure: 1 yr 5 mos
Current Experience: 2 yrs 3 mos

Infosys

Senior Data Engineer

Mar 2024 - Present · 2 yrs 3 mos · India · On-site

  • Designed and deployed data pipelines processing multi-terabyte datasets using Python and Apache Spark, cutting processing time by 40%.
  • Built ETL frameworks in Azure Data Factory and AWS Glue, improving data ingestion reliability and scalability by 30%.
  • Automated reporting solutions with SQL and Power BI, enhancing executive decision-making.
  • Led migration of legacy databases to cloud platforms (Azure/AWS), optimizing cost and performance.
  • Collaborated with Data Scientists and Analysts to support predictive modeling and real-time analytics.
Python · Apache Spark · Azure Data Factory · AWS Glue · SQL · Power BI · +2

Nuvento Inc

Data Engineer

Jan 2023 - Nov 2023 · 10 mos

  • Understood the business logic and implemented the required services.
  • Worked on the migration and configuration of on-premises data solutions to Azure Cloud. Resources used: CosmosDB, Azure Data Factory, Azure Databricks, PySpark, ADLS Gen2, Azure SQL Server.
  • Used Azure Databricks notebooks to consume data from ADLS Gen2, process it, and ingest the processed data into CosmosDB. Databricks Runtime: 6.5, Spark: 2.4.3, Scala: 2.11.
  • Designed and developed a Spark-based framework to process data stored in HDFS and create reports for the regulatory team.
  • Familiar with Azure networking and security resources such as Azure VNet, Azure Key Vault, and Network Interfaces.
  • Developed interactive dashboards in Power BI to visualize healthcare metrics, performance, and relevant KPIs.
  • Created detailed documentation and conducted knowledge transfer sessions for stakeholders and users.
Azure Cloud · CosmosDB · Azure Data Factory · Azure Databricks · PySpark · ADLS Gen2 · +4

Tiger Analytics

Data Engineer

Jun 2022 - Dec 2022 · 6 mos · Remote

  • Gathered requirements, analyzed data, clarified requirements with customers, planned estimates, and assigned user stories to the team for development and unit testing.
  • Used Azure Data Factory for the ETL (Extract, Transform, and Load) process.
  • Made the pipelines configuration-driven.
  • Worked on Azure Databricks to help stakeholders with their everyday data transformations.
  • Developed and maintained Power BI reports as required, and handled user access and security.
Azure Data Factory · Azure Databricks · Power BI · Data Engineering · ETL

Persistent Systems

Data Scientist / Engineer

May 2020 - Jun 2022 · 2 yrs 1 mo · Pune, Maharashtra, India · Remote

  • Involved in data collection, preparation and integration, infrastructure setup, performance optimization, and model deployment.
  • Interpreted data and analyzed results using statistical techniques.
  • Designed Python scripts to ensure functionality met customer requirements.
  • Acquired data from primary and secondary data sources and maintained databases.
  • Browsed and analyzed enterprise databases to simplify and improve product development and business processes.
  • Monitored Azure ML models that forecast from historical data patterns, customer behavior, usage trends, and service interactions.
  • Evaluated and fine-tuned models for accuracy.
  • Coordinated with various technical and functional teams to implement models and monitor results.
Data Collection · Infrastructure Setup · Performance Optimization · Model Deployment · Python · Azure ML · +2

Education

University of Mumbai

Bachelor's degree

Jan 2015 - Jan 2020
