Chaitanya Reddy

DevOps Engineer

United States · 8 yrs 4 mos experience
AI/ML Practitioner · AI Enabled

Key Highlights

  • Over 14 years of IT experience with a focus on data engineering.
  • Expert in designing scalable data architectures across multiple platforms.
  • Strong background in Cloud Data Engineering and DataOps best practices.

Skills

Core Skills

Data Engineering · Cloud Data Engineering · Cloud Migration · AI/ML Solutions · Full Stack Development · Cloud Solutions

Other Skills

Azure Databricks · Azure Data Factory · Python (Programming Language) · Hadoop · Apache Spark · Scala · Java Development · Data Architects · Data Loading · Cloudera · Data Architecture · PostgreSQL · Medallion Architecture · GxP Data Quality & Governance · Azure Synapse Analytics

About

Highly skilled Data Engineer with over 14 years of IT experience, including 10+ years specializing in data engineering. Proven expertise in designing, building, and optimizing scalable, high-performance data architectures across Big Data, cloud, and real-time analytics.

Proficient in Big Data frameworks such as Apache Hadoop (HDFS, Hive, MapReduce), Apache Spark (PySpark, Spark Streaming), Apache Flink, and Elasticsearch/OpenSearch, with extensive experience in distributed computing platforms like Databricks, Amazon EMR, Google Dataflow, and Azure Synapse. Expert in Python (6+ years), Java (7+ years), and Scala (9+ years), with a strong foundation in ETL/ELT design, data pipeline development, and data modeling (dimensional modeling, OLAP/OLTP, star/snowflake schemas). Extensive hands-on experience in real-time and batch data processing using Apache Kafka, Spark Streaming, Pulsar, and Amazon Kinesis, and in pipeline orchestration with Apache Airflow, Prefect, Dagster, Apache NiFi, Azure Data Factory (ADF), and Apache Oozie.

Proficient in SQL and NoSQL databases, including MySQL, PostgreSQL, Oracle, SQL Server, Apache HBase, MongoDB, and DynamoDB, with expertise in data warehousing technologies such as Snowflake, BigQuery, Amazon Redshift, and Azure Synapse Analytics, and in Lakehouse architectures built on Delta Lake, Apache Iceberg, and Apache Hudi.

Strong background in Cloud Data Engineering across AWS (EMR, Glue, S3, Lambda, Redshift, Kinesis), Azure (Data Factory, Databricks, Synapse, Microsoft Fabric), and GCP (BigQuery, Dataflow, Pub/Sub), with hands-on Infrastructure as Code (IaC) experience using Terraform, AWS CloudFormation, and Azure Bicep. Deep knowledge of DataOps and DevOps best practices, including CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI), data governance (GDPR, CCPA, HIPAA compliance), data security, metadata management, and data quality frameworks (Great Expectations, dbt).

Experienced with containerization and orchestration using Docker and Kubernetes (EKS, AKS, GKE), along with monitoring and logging solutions such as Grafana, Prometheus, Datadog, AWS CloudWatch, the ELK Stack, and Splunk. Familiar with serverless computing and event-driven architectures, leveraging AWS Lambda, Azure Functions, and GCP Cloud Functions, and with message-driven workflows using Kafka, AWS EventBridge, SNS, and SQS. Knowledgeable in MLOps and AI/ML integration, working with ML pipelines (MLflow, TFX, SageMaker) and feature stores (Feast, Databricks Feature Store) for machine learning-powered data solutions.

Experience

Total Experience: 8 yrs 4 mos
Average Tenure: 2 yrs 9 mos
Current Experience: --

JLL Technologies

Lead Data Engineer & AI/ML Solutions Architect – Azure | AWS | ADB | Spark | MS Fabric | Snowflake

Dec 2022 – Present · 3 yrs 5 mos · Remote

  • Designed and implemented Azure Databricks Medallion architecture (Bronze–Silver–Gold layers) supporting enterprise analytics and data quality initiatives.
  • Developed ETL/ELT pipelines using ADF, PySpark, and Delta Lake, integrating structured and unstructured data from SQL Server, APIs, and Lake Gen2.
  • Collaborated with QA and Compliance teams to implement GxP-aligned data validation, lineage tracking, and audit controls.
  • Defined data models, dictionaries, and semantic layers for analytical reporting and visualization in Power BI.
  • Leveraged Purview and Unity Catalog to establish governance, metadata management, and RBAC security.
  • Optimized Databricks clusters for performance and cost efficiency (30% reduction in compute time).
  • Optimized Snowflake workloads by leveraging clustering, partitioning, caching, and materialized views, cutting compute costs by 30% while improving dashboard responsiveness.
  • Enabled observability and monitoring with AWS CloudWatch, Databricks log analytics, and custom Python scripts, reducing incident resolution time by 25%.
  • Mentored and coached a cross-functional team of 5+ engineers on PySpark performance tuning, Snowflake optimization, and DevOps CI/CD pipelines (GitHub Actions, Terraform, Docker, Kubernetes).
  • Partnered with data scientists to operationalize ML/GenAI pipelines using SageMaker, MLflow, and Python APIs, integrating LLMs and RAG workflows via OpenAI API, LangChain, and vector databases (Pinecone, Weaviate).
Azure Databricks · Azure Data Factory · Python (Programming Language) · Hadoop · Apache Spark · Scala (+11 more)
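The Bronze–Silver–Gold flow described in the first bullet can be sketched in plain Python. This is a toy, dictionary-based illustration of the layering idea only — the actual pipeline would use Databricks Delta tables, and all field names here are hypothetical:

```python
from collections import defaultdict

def to_bronze(raw_rows):
    """Bronze: land raw records as-is, tagged with their layer."""
    return [dict(row, _layer="bronze") for row in raw_rows]

def to_silver(bronze_rows):
    """Silver: cleanse and validate -- drop rows missing required fields."""
    required = {"order_id", "amount"}
    return [
        {**row, "_layer": "silver", "amount": float(row["amount"])}
        for row in bronze_rows
        if required <= row.keys() and row["amount"] is not None
    ]

def to_gold(silver_rows):
    """Gold: aggregate into an analytics-ready summary per customer."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row.get("customer", "unknown")] += row["amount"]
    return dict(totals)

raw = [
    {"order_id": 1, "customer": "acme", "amount": "10.5"},
    {"order_id": 2, "customer": "acme", "amount": "4.5"},
    {"customer": "nodata"},  # malformed record: rejected at the silver layer
]
gold = to_gold(to_silver(to_bronze(raw)))
print(gold)  # {'acme': 15.0}
```

The point of the pattern is that each layer only trusts the one below it: raw data is never mutated in place, and quality rules live at the bronze-to-silver boundary.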

Synechron

Senior Data Engineer & Cloud Migration Specialist – GCP | AWS | Azure | Spark | Microservices

Jun 2021 – Nov 2022 · 1 yr 5 mos · Hyderabad, Telangana, India · Remote

  • Finance on Cloud (FOTC) Program Development: Spearheaded the development of core components for HSBC's FOTC program, automating data processing and report generation with Big Data frameworks such as Spark, Scala, and Hadoop ecosystem tools. Improved operational efficiency and enhanced data accessibility for stakeholders through modern ETL and orchestration workflows.
  • Big Data and Spark Optimization: Designed and optimized large-scale data pipelines using Apache Spark with Scala, achieving a 25% increase in data processing efficiency. Utilized HDFS, YARN, and Hive for distributed data storage and query optimization, ensuring faster analytics on structured and semi-structured datasets.
  • ETL Development with AWS Glue: Built and deployed highly efficient ETL pipelines using AWS Glue with Python and Scala. Leveraged Glue's DynamicFrame and DataFrame APIs for seamless integration with services such as S3, Redshift, RDS, and DynamoDB, and designed visual workflows in Glue Studio to streamline data extraction, transformation, and loading.
  • Data Lake Creation and Monitoring: Developed and maintained centralized data lakes on AWS Glue to facilitate scalable data storage and processing. Configured monitoring and logging with AWS CloudWatch and Glue logs, ensuring quick error resolution and proactive job management for enhanced system reliability.
  • Data Workflow Automation with Azure Data Factory: Automated and orchestrated data workflows across cloud platforms using Azure Data Factory (ADF). Designed robust pipelines to extract, transform, and load data into Azure Data Lake Storage, Synapse Analytics, and Azure SQL Database, reducing manual effort and accelerating analytics readiness.
Apache Spark · Big Data · Scala · Google Cloud Platform (GCP) · Amazon Web Services (AWS) · Java Development (+8 more)
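The ADF-style orchestration in the last bullet boils down to running activities in dependency order. A minimal sketch with the standard library's `graphlib` — the activity names are hypothetical stand-ins for real ADF pipeline activities:

```python
from graphlib import TopologicalSorter

# Hypothetical ADF-style pipeline: each activity lists its prerequisites.
dependencies = {
    "extract_sqlserver": set(),
    "extract_api": set(),
    "transform_spark": {"extract_sqlserver", "extract_api"},
    "load_synapse": {"transform_spark"},
    "refresh_powerbi": {"load_synapse"},
}

# A topological order is any execution sequence that never runs an
# activity before everything it depends on has finished.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A real orchestrator (ADF, Oozie, Airflow) adds retries, triggers, and parallelism on top, but the scheduling core is exactly this dependency-ordered traversal.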

Monetary Authority of Singapore (MAS)

Senior Data Engineer & AI/ML Solutions Specialist – Cloudera | AWS | Spark | Java | Microservices

Jan 2020 – May 2021 · 1 yr 4 mos · Singapore · On-site

  • Managed and optimized the MAS Enterprise Data Lake (EDL) on Cloudera Hadoop (HDFS, Hive, HBase), improving system throughput by 35% while achieving 100% compliance with MAS regulatory standards and enforcing RBAC and encryption policies.
  • Designed and implemented distributed data pipelines using Apache Spark (Scala, PySpark) for both batch and real-time processing, resulting in a 40% improvement in structured and semi-structured data transformation efficiency.
  • Built scalable ETL frameworks on HDFS, Hive, and Apache Oozie, enabling parallelized and compressed ingestion pipelines that reduced ingestion time by 50%.
  • Enhanced Hive data modeling using advanced partitioning, indexing, and bucketing techniques to support OLAP workloads, cutting query execution latency by 30%.
  • Automated batch data workflows using Apache Oozie, improving reliability and reducing manual operations by 60%.
  • Collaborated across business, analytics, and compliance teams to deliver trusted, high-availability datasets for BI and regulatory reporting, accelerating insight delivery timelines by 40%.
  • Contributed to data infrastructure readiness for future Lakehouse expansion by aligning with Apache Iceberg and Delta Lake adoption paths and introducing data contracts and schema evolution strategies.
  • Developed STRE (Stress Testing & Risk Engine) applications on Cloudera Hadoop, integrating regulatory models (Basel III/IV, CCR) and reducing end-to-end risk calculation latency by 30% through Spark-based parallel processing.
  • Engineered Spark ETL pipelines (Scala, PySpark) for processing structured and semi-structured datasets from banking and trade systems, achieving 40% processing efficiency gains and enhancing financial model scalability.
  • Designed microservice-based ingestion systems using Spring Boot and REST APIs, improving modularity and reducing dependency bottlenecks across compute layers.
Apache Spark · Big Data · Scala · PostgreSQL · Java Development · Data Architects (+5 more)
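The Hive partitioning and bucketing mentioned above works roughly like this: a row's partition columns determine its directory, and a deterministic hash of the clustering key determines its bucket file within that directory. A toy sketch (column names and bucket count are hypothetical, not taken from the actual MAS schema):

```python
import zlib

NUM_BUCKETS = 8

def hive_location(row):
    """Return (partition_path, bucket_id) for a row, Hive-style.

    Partition column values become directory names, so queries that
    filter on year/month prune whole directories; the bucket id is a
    deterministic hash of the clustering key modulo the bucket count,
    so equal keys always co-locate in the same bucket file.
    """
    partition = f"year={row['year']}/month={row['month']:02d}"
    bucket = zlib.crc32(str(row["account_id"]).encode()) % NUM_BUCKETS
    return partition, bucket

path, bucket = hive_location({"year": 2020, "month": 3, "account_id": "ACC-42"})
print(path, bucket)  # year=2020/month=03 plus a stable bucket in [0, 8)
```

Partition pruning is what cuts OLAP query latency, while stable bucketing enables bucket-wise joins and sampling without a full shuffle.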

IBM

Big Data Engineer & Full Stack Developer – AWS | Spark | Java | Microservices

Apr 2016 – Nov 2019 · 3 yrs 7 mos · Hyderabad Area, India · On-site

  • Enterprise Hadoop Environment Implementation and Management: Designed and supported a robust enterprise Hadoop ecosystem with expertise in Core Java, HDFS, and cluster administration. Led capacity planning, performance tuning, and high-availability configuration while collaborating with cross-functional infrastructure, network, database, and BI teams, and applied microservices architecture principles to enhance integration across data engineering systems.
  • Comprehensive Utilization of Hadoop Ecosystem Components: Worked extensively with Hadoop ecosystem tools, including Apache Spark, PySpark, YARN, MapReduce, Hive, HBase, ZooKeeper, Pig, and HDFS, to enable end-to-end data processing, analytics, and governance. Enhanced processing pipelines by integrating Scala and Python, driving efficiency within big data workflows.
  • AWS Glue Integration for Data Lake Solutions: Designed and implemented serverless ETL pipelines with AWS Glue, using Glue Crawlers for automated schema discovery and updates. Improved pipeline performance by leveraging AWS Lambda for automation, implementing data quality checks, and enabling schema evolution in a scalable, cost-effective manner, and applied microservices principles to keep ingestion pipelines modular and reliable.
  • ETL Pipeline Design and Development with AWS Glue: Built and managed efficient ETL pipelines in AWS Glue, extracting, transforming, and loading data from diverse sources into centralized data lakes on S3. Leveraged Glue features such as DynamicFrames and DataFrames to improve data transformation capabilities and analytics readiness, supporting near-real-time data availability.
Apache Spark · Big Data · Scala · PostgreSQL · Data Loading · Large Scale Systems (+5 more)
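Glue Crawlers, mentioned above, discover a table schema by sampling records. The core idea can be sketched in plain Python — a heavy simplification (the real crawler handles many file formats and richer type promotion rules), with sample data invented for illustration:

```python
def infer_schema(records):
    """Infer a column -> type-name mapping from sample records.

    Columns whose sampled values disagree on type are widened to
    'string', loosely mimicking how a crawler resolves conflicts.
    """
    type_names = {int: "bigint", float: "double", str: "string", bool: "boolean"}
    schema = {}
    for record in records:
        for col, value in record.items():
            t = type_names.get(type(value), "string")
            if col not in schema:
                schema[col] = t
            elif schema[col] != t:
                schema[col] = "string"  # conflicting samples: widen
    return schema

sample = [
    {"id": 1, "price": 9.99, "sku": "A-1"},
    {"id": 2, "price": "n/a", "sku": "A-2"},  # price conflicts -> string
]
print(infer_schema(sample))  # {'id': 'bigint', 'price': 'string', 'sku': 'string'}
```

Automating this inference is what lets a data lake absorb new or drifting source schemas without hand-maintained DDL.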

Autodesk

Data Engineer

Sep 2014 – Nov 2015 · 1 yr 2 mos · Bengaluru Area, India · On-site

  • Administration and Management of Big Data and AWS Environments: Administered projects such as UCP, EWS, and Web Activity with a strong focus on AWS cloud environments. Leveraged advanced knowledge of AWS Redshift, S3, and EC2 to design, manage, and maintain scalable data warehousing solutions, and ensured high performance, data quality, and availability through performance tuning that enabled efficient analytics and reporting.
  • Optimization and Deployment of Hadoop Environments: Proposed and implemented optimized configurations for Hadoop environments, including HDFS, YARN, and MapReduce. Conducted capacity planning and performance optimization for seamless big data processing, and deployed new hardware and software solutions to enhance scalability and system responsiveness as datasets grew.
  • User Management and Cluster Maintenance in Hadoop: Managed Hadoop user accounts and permissions, ensuring secure access and efficient data workflows. Administered clusters by adding or removing nodes and monitoring cluster health with tools such as Ambari and Cloudera Manager, wrote custom management utilities in Core Java, and maintained data integrity while optimizing resource utilization.
  • CI/CD and Build Operations for Data Engineering Projects: Orchestrated CI/CD pipelines using Jenkins, Git, GitHub Actions, and Bamboo to streamline automated builds, testing, and deployments. Monitored and optimized build processes for efficiency and consistency across projects, and enhanced collaboration through version control and automated workflows.
  • Data Quality, Governance, and Availability in HDFS: Ensured the availability and quality of data stored in HDFS by implementing stringent data governance practices and conducting regular validations.
Apache Spark · Big Data · Java Development · Cloudera · Data Loading · Apache Kafka (+3 more)
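Routine batch validations like those in the last bullet can be as simple as asserting row counts and per-field null-rate thresholds. An illustrative check — the threshold and field names here are hypothetical, not drawn from the actual governance rules:

```python
def validate_batch(rows, required_fields, max_null_rate=0.05):
    """Return a list of human-readable violations for one ingested batch."""
    violations = []
    if not rows:
        return ["batch is empty"]
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) is None)
        rate = nulls / len(rows)
        if rate > max_null_rate:
            violations.append(
                f"{field}: null rate {rate:.0%} exceeds {max_null_rate:.0%}"
            )
    return violations

batch = [{"user_id": 1, "event": "click"}, {"user_id": None, "event": "view"}]
print(validate_batch(batch, ["user_id", "event"]))
# ['user_id: null rate 50% exceeds 5%']
```

In production such checks would typically run inside a framework like Great Expectations and gate promotion of the batch rather than just report on it.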

Etisbew Technology Group, Inc. (a CMMI Level 3 company)

Software Engineer

Jan 2010 – May 2013 · 3 yrs 4 mos · Hyderabad, Telangana, India · On-site

  • Designed web templates including logos, banner ads, buttons, Flash interactions, icons, Flash intros, JavaScript and Flash animated menus, and style sheets. Structured and developed full websites in HTML, XHTML, CSS, JavaScript, and jQuery, from hand-coding through Dreamweaver.
  • Handled CMS administration, CRM maintenance, website backend support, and content management.
  • Performed on-page and off-page SEO optimization. Designed and converted basic HTML web pages, coordinating with programmers on further changes and adjustments, and worked with the Business Development team on presentations and visuals for business proposals. Developed and maintained the company website.

Education

University of Madras

Bachelor's degree — Computer Science

Jan 2004 – Jan 2007
