Rudhra Rai

DevOps Engineer

Bengaluru, Karnataka, India · 9 yrs 4 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Led a DevOps team to enhance infrastructure solutions.
  • Designed scalable AWS EKS architectures reducing onboarding time.
  • Achieved significant cost savings through strategic migrations.
Stackforce AI infers this person is a DevOps Engineer with extensive experience in SaaS infrastructure and cloud architecture.

Skills

Core Skills

DevOps, Solution Architecture, Cloud Architecture, Release Management, Configuration Management, Cost Management

Other Skills

AI Ops, AWS, AWS Elastic MapReduce, Analytical Skills, Ansible, C, C++, CI/CD, Cost Reduction, Docker, Elasticsearch, GCP, HTML, Hadoop YARN

About

Learning every day and improving.

Experience

Total Experience: 9 yrs 4 mos
Average Tenure: 3 yrs
Current Experience: 3 yrs 11 mos

Nexla

3 roles

DevOps Tech Lead

Jul 2025 - Present · 11 mos · India · Remote

Tech Lead Manager, DevOps

Promoted

Aug 2024 - Jul 2025 · 11 mos · India · Remote

  • Spearheaded a DevOps team while supporting pre-sales and post-sales infrastructure solutions for Nexla customers.
  • Initiated AI Ops support, implementing tools like KubeRay to make Python/AI workloads more efficient and help teams develop solutions 2x faster.
  • Established and streamlined processes for cost management, release management, and team operations.
  • Designed scalable AWS EKS cloud architectures for private installations, integrating ArgoCD and IaaS solutions, reducing customer onboarding time from weeks to hours.
Kubernetes, AWS, Python, Cost Reduction, Solution Architecture, DevOps

Senior DevOps Engineer

Jul 2022 - Sep 2024 · 2 yrs 2 mos · India · Remote

  • Architecture:
  • Designed cross-cloud, cross-region, and multi-account private link architectures.
  • Developed AWS EKS and EMR architectures tailored for Nexla applications.
  • Product Enhancements:
  • Established private connectivity solutions (VPN, VPC, and direct private links) to streamline customer onboarding.
  • Optimized Kafka PVC resizing, multi-database CDC integration, and monitoring infrastructure.
  • Tested and implemented the External Secrets Kubernetes Operator for secure secret management.
  • Product Management:
  • Led private install implementations on EKS, AKS, and GKE for key clients (on AWS, Azure, and GCP), showcasing technical expertise and people management skills.
  • These product enhancements and architectures helped onboard new customers faster, adding about $600K in ARR.
  • Tool Implementation:
  • Introduced and implemented tools like Cloudbeaver, RedisInsight, Kafka-UI, OpenCost, and Vault SSH, enabling secure, efficient, and cost-effective operations.
  • Enhanced monitoring with Prometheus (Mimir for long-term storage) and multi-cloud cost dashboards.
  • These tools and processes improved developer productivity, delivered timely alerts and observability for critical services, and provided a managed process for granting database and server access without delays.
  • AWS:
  • Set up CDC RDS infrastructure, health notifications for outages, and WAF ACL for rate limiting.
  • Migrated databases from MariaDB to Aurora using DMS; implemented cost-saving measures like ECR pull-through cache.
  • Achieved FTE Partner approval by enabling AWS Security Hub.
  • GCP:
  • Proactively optimized costs by adjusting GKE node pools, and structured the Terraform codebases.
  • Enabled Slack notifications for GKE and AWS health events for SaaS and infra environments.
  • Azure:
  • Productionized Nexla app in Azure Kubernetes Service (AKS), including Terraform infrastructure setup.
  • Implemented Azure workload identity for authentication and Google SSO integration.
AWS, GCP, Kubernetes, Prometheus, Kafka, Terraform, +2

Directi

DevOps Engineer II

Sep 2019 - Jun 2022 · 2 yrs 9 mos · Bangalore · On-site

  • Monitoring:
  • 1. Led the architecture design and implementation, from scratch, of monitoring, alerting, and metric collection infrastructure for 300+ boxes and 70+ apps using the Prometheus stack, reducing costs by $2,000.
  • High Availability:
  • 1. Guided the team in reducing manual scaling time by 70% using AWS Auto Scaling groups for apps serving 50,000+ live users.
  • 2. Solved the problem of a single point of failure and load distribution in the ELK stack by adding Logstash servers.
  • Release Management:
  • 1. Created Jenkins CI/CD pipelines that build RPM packages and Docker images, publish them to RPM and ECR repositories, and deploy them to the respective servers and ECS clusters.
  • 2. Migrated apps to ECS clusters for easier release management and autoscaling.
  • 3. Ran a proof of concept (POC) of serverless Jenkins using the Amazon Elastic Container Service (ECS) / Fargate plugin to reduce load on the Jenkins master servers.
  • Configuration management:
  • 1. Launched salt-api in SaltStack for smooth integration with other automation.
  • 2. Wrote various Salt state files for change management and deployment of apps across 300+ boxes.
  • Data integration and analysis:
  • 1. Created a Python tool that analyzes customer retention periods across 100k+ domains, which in turn helped pitch better offers and gather better feedback from customers.
  • Migrations and Benchmarking:
  • 1. Performed a zero-downtime MySQL database migration from Amazon RDS to Amazon Aurora using master-master replication.
  • 2. Benchmarked MySQL on RDS vs. Aurora MySQL in terms of pricing, scaling, upgrades, maintenance, and operations handling.
  • Other roles and responsibilities:
  • 1. Wrote parsers, backup scripts, and metric collection scripts.
  • 2. Wrote RCAs for issues and downtimes.
  • 3. Handled access management, AWS resource provisioning, Jenkins job creation, and issue debugging.
  • 4. Contributed to launching infrastructure with Terraform as part of the team's infrastructure-as-code initiative.
Prometheus, AWS, Jenkins, Python, Ansible, DevOps, +1

Flock

DevOps Engineer II

Sep 2019 - Jun 2022 · 2 yrs 9 mos · Bangalore

Innovaccer

DevOps Engineer

Jan 2017 - Sep 2019 · 2 yrs 8 mos · Noida Area, India

  • Configuration Management:
  • 1. Designed configuration management for all third-party services and in-house applications.
  • Cost Cutting:
  • 1. Migrated Elasticsearch to spot instances using the third-party service Spotinst, saving $30,000 per month.
  • 2. Migrated YARN compute loads to AWS EMR using AWS spot instances and Auto Scaling, saving $40,000 per month.
  • 3. Migrated QA and development servers from AWS EC2 to on-premises data centers.
  • AWS technologies:
  • 1. Improved the AWS account authentication process by introducing assume-role for cross-account management.
  • 2. Used other services such as EC2, S3, IAM, EMR, ELB, Spot Instances, CloudWatch, Route 53, VPC peering, and VPN tunnelling.
  • Monitoring:
  • 1. Architected and established infrastructure monitoring with the team using Prometheus, exporters, and Alertmanager.
  • 2. Built a custom backup and monitoring system in Python.
  • Deployment:
  • 1. Created Jenkins jobs for production deployments.
  • 2. Wrote shell and Ansible scripts for installing services.
  • 3. Ran production deployments using Jenkins and Ansible.
  • Production Optimisation:
  • 1. Optimized Elasticsearch clusters through monitoring and resolved issues arising in production.
  • 2. Resolved issues related to Spark jobs and YARN.
  • 3. CentOS / Ubuntu system administration and performance tuning.
AWS, Elasticsearch, Python, Ansible, Configuration Management, Cost Management

Education

Vellore Institute of Technology

Bachelor of Technology (BTech) — Computer Science

Jan 2013 - Jan 2017
