MANOJ DAS

DevOps Engineer

Bengaluru, Karnataka, India · 13 yrs 5 mos experience

Key Highlights

  • Over 12 years of experience in data engineering.
  • Expertise in cloud-native data platforms and analytics solutions.
  • Proven track record in optimizing ETL data pipelines.

Skills

Core Skills

Data Engineering · Cloud Computing · Software Development

Other Skills

Data Build Tool (DBT) · Apache Airflow · Databricks · AWS Glue · SQL · Data Modeling · Amazon Web Services (AWS) · PySpark · Redshift · Spark · Scala · Hive · Maven · Hibernate · Java

About

Seasoned Senior Data Engineer with over 12 years of experience designing, architecting, and delivering scalable, cloud-native data platforms and analytics solutions. Proven expertise across AWS, GCP, Azure, Snowflake, Databricks, BigQuery, and modern data engineering tools like dbt, Apache Spark, and Airflow. I have worked in remote-first teams since 2019, collaborating across multiple time zones using tools like Slack, Microsoft Teams, GitHub, and Jira.

I specialize in building enterprise-grade data lakes and warehouses, developing optimized ELT/ETL pipelines, and implementing advanced data modeling techniques including Data Vault 2.0. I’ve led cloud modernization efforts, migrated legacy on-prem systems to modern cloud infrastructures, and automated mission-critical data workflows handling 10+ TB/day across diverse domains such as Retail, Finance, and Insurance.

Key achievements include:

  • ✅ Architected Snowflake-based solutions with dbt and Data Vault 2.0 for scalable forecasting pipelines.
  • ✅ Designed a BigQuery analytics stack with dbt for card member insights.
  • ✅ Automated AWS Glue pipelines, improving data availability and reducing processing time by 35%.
  • ✅ Hands-on leadership in Spark optimization, Airflow orchestration, and real-time streaming using Kinesis.
  • ✅ Leadership & collaboration: experience leading cross-functional teams and mentoring junior engineers; skilled at communicating complex technical concepts to both technical and non-technical stakeholders.
  • ✅ Continuous learning: passionate about staying ahead of industry trends, continuously deepening knowledge of cloud, big data, and data governance to drive innovation and solve complex challenges.

I’m passionate about enabling data-driven decision-making, mentoring high-performing engineering teams, and driving innovation through automation and best practices. Let’s connect if you’re working on cloud data platforms, real-time analytics, or modern data architecture!

Experience

13 yrs 5 mos
Total Experience
2 yrs 8 mos
Average Tenure
--
Current Experience

Impetus

Lead Data Engineer

Jan 2022 – Apr 2025 · 3 yrs 3 mos · Bengaluru, Karnataka, India · Hybrid

  • Built and optimized data pipelines, resulting in a 15% improvement in data processing efficiency and a 20% reduction in latency.
  • Developed and deployed the promotion forecast module with 100% test coverage, improving forecast accuracy by 25% and reducing system downtime during high-traffic periods.
  • Designed a data warehouse to manage customer demand data, implementing duplicate checks and date range validation, reducing data inconsistencies by 30% and enabling faster report generation by 40%.
  • Built an optimal and scalable data pipeline processing 10TB of data daily, reducing pipeline run times by 35%, by collaborating with stakeholders and technical leaders to ensure alignment with business requirements.
  • Architected enterprise-level ETL workflows integrating data from SQL, No-SQL, and Big Data technologies, improving data integration efficiency by 20% and enabling real-time analytics across the business.
  • Collaborated with the architecture engineering team to implement quality solutions, ensuring a 100% adherence to engineering best practices, resulting in a 15% reduction in error rates and faster project turnaround times.
  • Optimised Databricks Spark applications to process 5TB of data in under 5 minutes, improving processing speed by 40% and modularising the application to support continuous integration and deployment, ensuring full test coverage and system stability.
  • Created the architectural design for a Redshift export utility using Glue and Spark; automated the process with shell scripts, MongoDB, Lambda, and Step Functions, adding error handling and auditing for a fail-safe process.
  • Extensive experience with Databricks, optimising large-scale data pipelines and workflows to enhance performance and cost efficiency; developed high-performance Spark jobs within Databricks to handle vast volumes of data efficiently.
Data Build Tool (DBT) · Apache Airflow · Databricks · AWS Glue · SQL · Data Engineering +1

Taylor & Francis Group

Senior Data Engineer

Sep 2020 – Dec 2021 · 1 yr 3 mos · Bengaluru, Karnataka, India · Remote

  • Led a team of 5 engineers to automate data pipelines using AWS Glue and Redshift, reducing data processing times by 30% and enabling real-time data ingestion across multiple data sources.
  • Designed and implemented data warehouses and optimized data workflows, improving query performance by 25% and reducing storage costs by 15% through better data compression and partitioning strategies.
  • Utilized AWS Glue to automate data extraction from S3 to Redshift, reducing manual ETL efforts by 50% and improving data validation accuracy through automated checks, ensuring 99% data integrity in production systems.
  • Developed and maintained ETL pipelines for structured and semi-structured data across cloud (AWS) and on-premise databases, streamlining data flows and reducing pipeline failures by 20% through robust error handling and monitoring mechanisms.
  • Created data models and optimized Redshift clusters, reducing query run times by 30% and achieving a 20% cost reduction by right-sizing instances and employing optimized partitioning and indexing strategies.
  • Collaborated with business teams to define data requirements and designed ETL workflows that improved reporting accuracy by 20% and reduced report generation times from hours to minutes, meeting critical business objectives efficiently.
Amazon Web Services (AWS) · PySpark · Redshift · Data Engineering · Cloud Computing

Softcrylic

Senior Data Engineer

May 2017 – Aug 2020 · 3 yrs 3 mos · Chennai, Tamil Nadu, India · On-site

  • Developed ETL pipelines using Spark (SQL and Core) in Scala, Hive, and Unix, processing over 10TB of data daily, improving pipeline efficiency by 30% and reducing manual interventions through automation, which directly contributed to a 40% increase in online sales for the Ad-tech division.
  • Implemented a pluggable and reusable ETL framework with built-in monitoring and error logging, improving development time for new use cases by 20% and increasing operational visibility across all ETL pipelines.
  • Developed Avro utilities for dynamic schema generation of daily refreshed tables, reducing development time by 20% and improving data consistency across multiple reporting systems.
  • Implemented performance optimizations in Spark applications, including tuning memory executors and cores, reducing job completion times by 35% and improving overall system resource utilization.
  • Delivered the data needed by 15+ data scientists across 4 autonomous agile teams to forecast sales and demand.
  • Developed a pipeline for efficient and intelligent storage of archival data in Google Cloud and Amazon S3.
  • Developed an auditing and orchestration tool in Java to execute Hive and Spark scripts in parallel, adding granular control and error checks for each step, and used the Simba JDBC driver to execute big queries from the Java application.
  • Structured the application into discrete steps, monitoring and logging the status of each step for better job management.
Apache Airflow · Amazon Web Services (AWS) · Data Engineering · Cloud Computing

Cognizant

Senior Data Engineer

Nov 2014 – May 2017 · 2 yrs 6 mos · Chennai Area, India · On-site

  • Automated ETL processes using Spark and Hive, reducing data wrangling time by 40% and increasing pipeline reliability by implementing robust error handling and job monitoring.
  • Designed and implemented reusable generic Hadoop systems, cutting development time by 30% and enabling the team to scale data processing across multiple projects, increasing operational efficiency by 20%.
  • Developed and delivered custom reports for CEC management and QA teams, improving performance management and reducing report generation time by 25%, leading to better decision-making and resource allocation within the team.
  • Provided in-depth analysis comparing customer and internal evaluations, leading to a 20% improvement in customer service quality by identifying key areas for training and process improvements.
  • Utilized Sqoop to ingest and retrieve data from RDBMS systems like MSSQL and Oracle, reducing data extraction time by 15% and ensuring seamless integration with the Hadoop ecosystem for large-scale data processing and analytics.
  • Developed a Spark and Scala application to automate seller information extraction, reducing manual data retrieval by 30% and providing real-time reports in XML format, enhancing the business team's decision-making process.
  • Tuned Hive and Spark query performance, reducing data retrieval times by 25% and optimizing resource utilization for processing large datasets, leading to a 15% cost reduction in operational expenditures.
PySpark · Hive · Data Engineering · Cloud Computing

Tata Consultancy Services

Software Developer

Sep 2011 – Nov 2014 · 3 yrs 2 mos · Bengaluru, Karnataka, India · On-site

  • Built an application with Java, Spring, and Hibernate providing end-to-end support for insurance processing, from quote to claim.
  • Created operational reports for the business team in SQL and XML, and resolved bugs in and enhanced previous ops reports.
  • Resolved bugs found in system testing and delivered enhancements according to client requirements.
  • Created modules for database and web-page communication.
  • Built and worked on a Java application using JDBC for the advisor team.
Maven · Hibernate · Software Development

Education

ITER, Bhubaneswar, Odisha

Master of Computer Applications (MCA) — Computer Application

Jan 2008 – Jan 2011

Ravenshaw University

Bachelor of Science - BSc — Physics

Jan 2005 – Jan 2008

SASS, Taladanda, Odisha

Schooling — General Studies

Jan 1993 – Jan 2003
