Siddhant Kapur

Software Engineer

Gurugram, Haryana, India

5 yrs 4 mos experience

Most Likely To Switch

AI ML Practitioner

Key Highlights

4 years of experience in Data Engineering.
Expertise in Big Data frameworks and AWS services.
Proven track record in data integration and visualization.

Stackforce AI infers this person is a Data Engineer with expertise in Pharmaceutical and Supply Chain sectors.

Contact

siddhantkapur98@gmail.com LinkedIn

Skills

Core Skills

Data Engineering, Big Data, Backend Development, Data Visualization, Data Integration

Other Skills

API Testing, APIFY, APIs, AWS Glue, AWS Lambda, AWS OpenSearch, Agile Methodologies, Amazon S3, Amazon Web Services (AWS), Apache Solr, Apache Spark, Azure Data Factory, Azure Data Lake, Azure Databricks, Azure DevOps

About

Data Engineer Experience: 4 years
Software Engineering Experience: 5 years
Total Experience: 5 years
• Domain: Pharmaceutical and Supply Chain.
• Big Data Framework: Spark.
• Databases: MSSQL, MySQL, PostgreSQL, MongoDB.
• AWS Services: S3, Glue, Lambda, IAM, RDS, OpenSearch, DynamoDB, API Gateway, SES.
• Azure Services: Azure Data Factory, Azure Databricks, Azure SQL Server, Azure Blob Storage, Azure Data Lake Storage Gen 2, Azure Logic Apps.
• Visualization Tools: PowerBI and SSRS.
• Data Integration Tools: ADF, SSIS.
• Programming Languages: Python.
• Frameworks: FastAPI, Flask.
• DevOps & Lifecycle Management: JIRA, GitLab.
• Vibe Coding Tools: Lovable & Cursor.

Experience

Total Experience: 5 yrs 4 mos

Average Tenure: 2 yrs 8 mos

Current Experience: 4 yrs 1 mo

ZS

3 roles

Senior Engineer

Promoted

Aug 2024 – Present · 1 yr 10 mos · Gurugram, Haryana, India · Hybrid

Business Technology Associate Consultant

Jul 2024 – Jul 2024 · 0 mo · Gurugram, Haryana, India · Hybrid

Business Technology Solutions Associate

Mar 2022 – Jun 2024 · 2 yrs 3 mos · Gurugram, Haryana, India · Hybrid

EEA RTS Search Crawler: As a Backend Python Developer & Data Engineer, I developed APIs, implemented web crawling, and managed data ingestion using JSON, XML, and Delta formats. I designed scalable data models and pipelines with PySpark, Hadoop, and Apache Kafka, and optimized PostgreSQL and AWS Glue for metadata management and storage. I established CI/CD pipelines with GitLab and utilized web scraping tools like APIFY.
Intelligent Search Platform: Improved internal search accuracy and streamlined onboarding of SPO sites’ toplinks. Developed and maintained the Total Experience Dashboard on AWS OpenSearch, enhancing data visualization and user experience.
Data Marketplace: Led the Orphaned Databases Cleanup project, used ChatGPT and LangChain for asset description, and implemented Auto-Tag generation. Designed technical solutions, onboarded new team members, and managed deployment scripts.
Skills: Expertise in Hadoop, Apache Spark, data lakes, ETL pipelines, and data integration. Advanced in LangChain, FastAPI, and DevOps practices.

Hadoop, Apache Spark, data lakes, ETL pipelines, data integration, Langchain, +4

Infosys

System Engineer

Nov 2020 – Feb 2022 · 1 yr 3 mos · Pune, Maharashtra, India

As a Data Engineer, created a solution to tag workloads to agents based on work assignments.
Processed big data sourced from Azure Data Lake in Azure Databricks and published the processed data to Azure SQL Server through an Azure Data Factory pipeline.
Provided a solution to schedule automated jobs for agents based on severity, workload, region, holidays, and many other criteria.
Migrated an on-prem database to Azure SQL Server with all stored procedures and functions, with logic computed in Azure Databricks.
As part of the Azure DevOps team, deployed and tested code in higher environments.
Created and analyzed data validation and data quality reports in PowerBI.