Suraj T. — Software Engineer

🚀 Suraj Tripathi - Unleashing Data Magic 🚀

📧 Email: suraj.mnitjaipur@gmail.com

🎓 Education: Malaviya National Institute of Technology, Bachelor of Technology (Electronics and Communication Engineering), Class of 2016 | CGPA: 7.11

🌟 Skills: Python | SQL | PySpark | Microsoft Fabric | ETL | Apache Spark Streaming | Apache Kafka | Azure DevOps | Large Language Models | Apache Airflow | Machine Learning | AWS | Dataiku | LangChain | Apache Dagster | Azure | CI/CD | Synapse | Event Hub

🌐 Professional Data Sorcerer
At Microsoft, I'm more than a Data Engineer; I'm a sorcerer, conjuring insights from the data realm. My wand of choice? Python, PySpark, and Azure Synapse. Here's a glimpse of my magic:
✨ Data Streaming Maven: I engineered an event streaming solution, wielding PySpark on Synapse, and processed over 10 million events daily from Azure Event Hub to Delta Lake. The result? A phenomenal 40% boost in data processing efficiency.
✨ Code Wizard: I coaxed LangChain and OpenAI to craft PySpark spells from English prompts, cutting data validation time by 60%.
✨ Azure Architect: With ARM templates, I summoned Synapse workspaces, Key Vaults, and ADLS, speeding up project setup by 60%.
✨ Logging Enchanter: My spells with Azure Log Analytics made issue tracking faster than a blink, slashing incident response time by 25%.
✨ Data Alchemist: In a cross-functional alliance, we optimized data architecture, and query execution time bowed before us, decreasing by 30%.

🌍 Freelance Cloud Voyager
As a Cloud Data Engineer consultant, I've roamed the digital landscapes, specializing in Amazon S3 migrations, data partitioning, ETL enhancements, and time-bending project completions.

🌆 Former Genpact Innovator
At Genpact, I orchestrated data migrations, automated ETL with Dataiku's secret potions, migrated thousands of databases to Snowflake, and conjured processing speed enhancements with PySpark.

🔮 Amdocs Time Bender
In my time at Amdocs, I mastered Telecom data, reduced runtimes by 90%, and managed the complexities of Ordering, Billing, and Migration.

🛠️ Personal Projects - Data Artisan
In my creative laboratory, I crafted a Breast Cancer Classification masterpiece on Udemy. I achieved a remarkable 98% accuracy using SVM models, adding data normalization and grid search for a touch of genius.

Let's connect and weave some data magic together. Whether you seek insights or share a passion for data wizardry, I'm always up for an enchanting conversation.

Stackforce AI infers this person is a Data Engineer specializing in Cloud Computing and Data Analytics across various industries.

Location: Gurugram, Haryana, India

Experience: 9 yrs 6 mos

Skills

Data Engineering
Cloud Computing
Data Analytics
Machine Learning
Data Analysis

Career Highlights

Engineered solutions processing over 10 million events daily.
Achieved a 40% boost in data processing efficiency.
Spearheaded migration of 8,000 databases to Snowflake.

Work Experience

Emids

Lead Data Engineer (1 yr)

Yieldmo

Data Engineer (1 yr 2 mos)

Upwork

Cloud Data Engineer Consultant (6 mos)

Microsoft

Data Engineer (1 yr 8 mos)

Genpact

Data Engineer (1 yr)

iNeuron.ai

Machine Learning Intern (2 mos)

Amdocs

Data Engineer (4 yrs 8 mos)

Education

Bachelor of Technology (B.Tech.) at Malaviya National Institute of Technology Jaipur

Associate’s Degree at Swami Harsewanand Public School

High School at Happy Home English School

AI Enabled · AI ML Practitioner

Skills

Core Skills

Data Engineering · Cloud Computing · Data Analytics · Machine Learning · Data Analysis

Other Skills

AWS · Amazon Web Services (AWS) · Analytics · Apache Airflow · Apache Dagster · Apache Kafka · Apache Spark · Apache Spark Streaming · Azure Cosmos DB · Azure Data Factory · Azure Data Lake · Azure Databricks · Azure DevOps · Azure Event Hub · Azure Functions

Experience

Total Experience: 9 yrs 6 mos
Average Tenure: 2 yrs 1 mo
Current Experience: 1 yr

Emids

Lead Data Engineer

Jun 2025 – Present · 1 yr · Noida, Uttar Pradesh, India · Hybrid

Python · PySpark · Azure Databricks · SQL · Apache Kafka · Data Engineering · +1

Yieldmo

Data Engineer

Jan 2024 – Mar 2025 · 1 yr 2 mos · India · Remote

● Designed and developed scalable data pipelines using Airflow, Snowflake, and AWS, ensuring efficient data processing for ad exchange analytics.
● Collaborated with multiple third-party vendors to generate and deliver customized datasets via S3 and email, streamlining data sharing for external partners.
● Led the integration of Liveramp data into internal pipelines, enriching customer insights and improving decision-making for targeted ad placements.
● Optimized SQL queries and Snowflake workloads, reducing query times by 15% and improving data accessibility for analytics and reporting.
● Successfully backfilled a month's worth of critical client data within a day, ensuring zero downtime and data consistency.
● Implemented complex transformations in data pipelines, adding key fields to enhance reporting accuracy and business intelligence.
● Participated in a hackathon to optimize click-through rates (CTR) using Spark ML, demonstrating a potential uplift in CTR with predictive modeling.
● Provided production support, resolving 95% of critical data pipeline incidents promptly to ensure uninterrupted operations.

Python (Programming Language) · MySQL · Snowflake · Apache Airflow · PySpark · Data Engineering · +3

Upwork

Cloud Data Engineer Consultant

May 2023 – Nov 2023 · 6 mos · Remote

● Led the migration of disparate agricultural data from multiple sources into Trino, ensuring seamless integration and accessibility. Designed and implemented an optimal partitioning strategy, improving query performance and reducing processing time by 30% for large tables.
● Developed a robust Dagster-based orchestration pipeline to automate ETL workflows, improving data reliability and reducing manual intervention by 40%. Implemented monitoring and alerting mechanisms to enhance pipeline stability.
● Leveraged Python, SQL, and dbt to optimize data transformations and schema design, reducing data processing costs by 25%.

Amazon Web Services (AWS) · Apache Dagster · data build tool (dbt) · SQL · Python · Data Engineering · +1

Microsoft

Data Engineer

Apr 2022 – Dec 2023 · 1 yr 8 mos · Noida, Uttar Pradesh, India

● Engineered an event streaming solution using PySpark on Synapse, processing 10 million+ events per day from Azure Event Hub to Delta Lake, achieving a 40% increase in data processing efficiency.
● Innovated a data quality solution with LangChain and OpenAI, automating the generation of PySpark code from English prompts, resulting in a 60% reduction in data validation time.
● Designed and managed data pipelines in Azure Data Factory (ADF) to orchestrate data movement and transformations across cloud storage, Synapse, and Delta Lake.
● Orchestrated 15+ CI/CD pipelines via Azure DevOps, ensuring seamless deployment across development, testing, and production environments.
● Drove the creation and provisioning of essential Azure resources including Synapse workspaces, Key Vaults, and ADLS using ARM templates, contributing to a 60% faster project setup.
● Led the implementation of comprehensive logging practices using Azure Log Analytics, speeding up issue resolution and cutting incident response time by 25%.

Continuous Integration and Continuous Delivery (CI/CD) · DevOps · Azure Cosmos DB · PySpark · Cloud Supply Chain and Provisioning · Databases · +19

Genpact

Data Engineer

Apr 2021 – Apr 2022 · 1 yr · Bengaluru, Karnataka, India

● Spearheaded the migration of semi-structured JSON data to Snowflake, leveraging Python, SQL, and AWS S3, ensuring efficient storage and retrieval.
● Automated ETL processes and email notifications with Dataiku triggers and AWS services, resulting in a 30% reduction in processing time.
● Managed the successful migration of 8,000 databases from SQL Server to Snowflake, optimizing schema design and query performance.
● Improved processing speed by 80% using PySpark and Snowflake.

Dataiku DSS · Amazon Web Services (AWS) · PySpark · Databases · Analytics · Data Ingestion · +10

iNeuron.ai

Machine Learning Intern

Jan 2021 – Mar 2021 · 2 mos

Anomaly Detection:
Detected anomalies in wireless transmission logs and identified the KPIs.
Technology: Python, Scikit Learn, MongoDB, Flask, Beautiful Soup

Python · Scikit Learn · MongoDB · Flask · Beautiful Soup · Machine Learning · +1

Amdocs

Data Engineer

Aug 2016 – Apr 2021 · 4 yrs 8 mos · Gurgaon, India

● Developed Telecom business offerings, processing and analyzing data using Python, SQL, and Pandas.
● Reduced daily mapping process runtime from 20 hours to 2 hours through ETL optimization and automation, a 90% reduction in runtime.
● Managed Ordering, Billing, and Migration activities for a user base of 100K+, leveraging SQL, Data Warehousing, and CRM systems.
● Successfully handled software defects using Agile methodologies and Jira, resolving an average of 15 issues per sprint.

Data Analysis · Databases · Analytics · Customer Relationship Management (CRM) · Data Ingestion · SQL · +4