Dilip Rathod

SRE (Site Reliability Engineer)

India · 8 yrs 9 mos experience
Most Likely To Switch

Key Highlights

  • Over 7 years of experience in DevOps and cloud infrastructure.
  • Expert in Kubernetes management and optimization.
  • Proven track record of cost-saving initiatives in cloud infrastructure.
Stackforce AI infers this person is a SaaS Infrastructure Engineer with a strong focus on Kubernetes and cloud optimization.

Skills

Core Skills

Kubernetes Management, Infrastructure Automation, Cloud Cost Optimization, Deployment Automation, Infrastructure Management

Other Skills

AWS Command Line Interface (CLI), Amazon CloudWatch, Amazon EC2, Amazon EKS, Amazon Web Services (AWS), Ansible, Auto Scaling Groups, Automation, Bash, C, C++, Capacity Planning, Chef, CloudWatch, Containerization

About

I am a dedicated Senior Site Reliability Engineer with over 7 years of experience in the DevOps and cloud infrastructure domain. My expertise lies in building, automating, and optimizing systems to ensure performance, scalability, and cost efficiency. Throughout my career, I’ve had the opportunity to work across various industries, helping companies implement robust infrastructure solutions, streamline operations, and enhance observability.

With extensive hands-on experience in Kubernetes, Docker, AWS, Azure, and GCP, I have led numerous projects centered around Kubernetes management and optimization, including migrating services to Kubernetes, automating scaling, and enhancing system monitoring. I am passionate about leveraging modern tools like Terraform and KEDA, alongside monitoring solutions such as Prometheus and Datadog, to create reliable and efficient systems.

Key focus areas in my work include:

  • Kubernetes Management: Streamlining and managing Kubernetes clusters to improve scalability, performance, and reliability across environments.
  • Kubernetes Automation: Automating deployments, scaling, and service management using Kubernetes-native tools like Helm, KEDA, and AWS Load Balancer Controller.
  • Kubernetes Monitoring and Observability: Implementing robust monitoring solutions like Prometheus and Datadog to ensure the health and performance of Kubernetes workloads.
  • Kubernetes Cost Optimization: Using tools like Kubecost and other cost management strategies to optimize resource usage and reduce cloud spending.

I enjoy solving complex technical challenges in Kubernetes environments and continuously seek opportunities to innovate and optimize cloud infrastructure. Let’s connect and explore how we can collaborate to build more resilient, cost-effective, and scalable Kubernetes-driven solutions.

Experience

Total Experience: 8 yrs 9 mos
Average Tenure: 1 yr 9 mos
Current Experience: 3 yrs 1 mo

Clari

Sr. Site Reliability Engineer

May 2023 - Present · 3 yrs 1 mo · Bengaluru, Karnataka, India · Remote

  • Led Kubernetes stack management and optimization.
  • Spearheaded infrastructure cost-saving initiatives.
  • Enhanced observability and monitoring with Datadog and Prometheus.
  • Set up a logging stack with CloudWatch to send Kubernetes logs for centralized logging and monitoring.
  • Automated environment setup using Terraform and Terragrunt to streamline infrastructure provisioning and management.
  • Implemented ArgoCD for continuous deployment (CD) to Kubernetes, automating the deployment process and ensuring seamless updates across environments.
Kubernetes, Terraform, Datadog, Prometheus, CloudWatch, Kubernetes Management, +1

Joveo

Lead DevOps Engineer

Nov 2020 - Apr 2023 · 2 yrs 5 mos · Hyderabad, Telangana, India

  • Managed and optimized Kubernetes infrastructure, ensuring seamless operations and system reliability.
  • Implemented cloud cost optimization strategies, significantly reducing expenses while enhancing operational efficiency.
  • Improved system reliability by automating processes and deploying advanced monitoring tools for proactive issue resolution.
  • Successfully migrated over 60 services to Kubernetes, ensuring smooth transitions with minimal downtime.
  • Led the migration from Datadog to Grafana Cloud, streamlining observability and monitoring efforts.
  • Achieved a 50% reduction in COGS through strategic cost optimization initiatives.
Kubernetes, Cloud Cost Optimization, Monitoring Tools, Datadog, Grafana, Kubernetes Management

Dream11

SD2 - DevOps

Apr 2020 - Nov 2020 · 7 mos · Mumbai, Maharashtra, India · Hybrid

  • Implemented one-click deployment for multiple Auto Scaling Groups (ASG) at Dream11.
  • Automated routing based on service stack numbers to streamline deployment processes.
  • Enhanced deployment logic to enable auto-scaling of service stacks based on user input.
Auto Scaling Groups, Deployment Automation

Qubole

2 roles

MTS - DevOps

Promoted

Aug 2019 - Apr 2020 · 8 mos

  • Managed logging infrastructure and ensured high service availability at Qubole in Bengaluru, India.
  • Implemented application monitoring using SignalFx and automated infrastructure processes with Terraform.
  • Optimized cloud infrastructure with Cloud Custodian to enhance performance and efficiency.
Logging Infrastructure, Monitoring Tools, Terraform, Infrastructure Management

Site Reliability Engineer

Jul 2018 - Jul 2019 · 1 yr

Inmobi

Site Reliability Engineer

Jun 2017 - Jun 2018 · 1 yr · Bengaluru Area, India

  • Developed a web-based terminal for Mesos containers for developer debugging.
  • Implemented log analysis and monitoring systems to detect suspicious network activity.
  • Contributed to service deployment, onboarding, and monitoring.
Web-based Terminal Development, Log Analysis, Infrastructure Management

Education

MIT Academy of Engineering

Bachelor’s Degree — Computer Engineering

Jan 2013 - Jan 2017
