Dhiraj Kadam

SRE (Site Reliability Engineer)

Toronto, Ontario, Canada · 8 yrs 5 mos experience

Key Highlights

  • Reduced server provisioning time by 40% with Kubernetes and Docker.
  • Achieved annual cost savings of $250,000 through operational efficiency.
  • Designed highly available architectures resulting in 99.9% uptime.
Stackforce AI infers this person is a Site Reliability Engineer with expertise in cloud infrastructure and DevOps practices.

Skills

Core Skills

DevOps, Kubernetes, Site Reliability Engineering, AWS, GCP

Other Skills

AWS CloudFormation, AWS Lambda, Agile Methodologies, Amazon Web Services (AWS), Ansible, C, C++, CI/CD, CSS, Chef, Cloud Computing, Cloud-Native Applications, Configuration Management, Continuous Delivery, Continuous Integration

About

🚀 Lead Site Reliability Engineer | 10+ Years of Expertise

I'm a dedicated Lead Site Reliability Engineer with over 8 years of hands-on experience in driving operational excellence and optimizing system performance. My mission is to harness the power of technology to elevate businesses to new heights.

🌟 Key Achievements:

  • Containerization Efficiency: Led Kubernetes and Docker implementation, reducing server provisioning time by 40% and improving resource utilization by 30%. Achieved annual cost savings of $250,000.
  • Observability: Led an observability initiative that reduced downtime by 30% by implementing SLOs, SLIs and error budgets, further improving Mean Time To Recover (MTTR) and Mean Time Between Failures (MTBF).
  • High Availability Architectures: Designed and implemented highly available architectures on AWS and GCP, resulting in 99.9% application uptime.
  • Automated Deployment: Introduced Jenkins pipelines, cutting software deployment time by 50% and enabling faster feature releases.

🤖 Automation Evangelist: I've developed custom Python scripts and tools, making complex tasks feel like a breeze and eliminating bottlenecks.

🌐 Infrastructure as Code Pro: With Terraform, I've mastered the art of managing infrastructure as code, ensuring reproducibility and scalability across deployments.

💼 Collaboration Maven: I'm not just a techie. I excel at fostering strong cross-functional team relationships and enabling seamless communication, because I believe that exceptional results are born from effective teamwork.

I'm here to add value, drive operational efficiency, and make a positive impact. Let's connect and explore opportunities to work together!

📧 dhirajkadam4191@gmail.com

Competencies: Kubernetes, OpenTelemetry, Prometheus, VictoriaMetrics, Grafana, ELK, Docker, AWS, GCP, Python, Jenkins, Spinnaker, Terraform, Linux, Ansible, Git, DevOps, CI/CD.

Experience

Total Experience: 8 yrs 5 mos
Average Tenure: 1 yr 2 mos
Current Experience: --

Nue.io

Site Reliability Engineer

Nov 2024 – Present · 1 yr 7 mos · Toronto, Ontario, Canada · Remote

  • Reinvent your revenue lifecycle! Nue is an easy-to-manage, omni-channel RevOps platform designed to meet the needs of the modern business. With Nue, RevOps teams can accelerate sales with innovative pricing models and streamlined sales processes from quote to order to renewal, all the while delivering accurate analytics to Finance.
  • Learn more at http://nue.io
DevOps, Kubernetes, Amazon Web Services (AWS), GitLab, Elastic Stack (ELK), Terraform, +4

Virtasant

Site Reliability Engineer

Jan 2024 – Nov 2024 · 10 mos · Toronto, Ontario, Canada · Remote

DevOps, Kubernetes, Elastic Stack (ELK), Grafana, Prometheus.io, Python (Programming Language)

Career break

Relocation

Oct 2023 – Jan 2024 · 3 mos · Toronto, Ontario, Canada

  • Relocated to Toronto, Canada

ASAPP

Lead Site Reliability Engineer | AWS, Kubernetes, Python, Terraform, Docker and Linux

Jan 2023 – Sep 2023 · 8 mos · Hybrid

  • Spearheaded a high-performing team of SREs, fostering a culture of ownership, accountability, and continuous learning
  • Implemented the observability pillars (monitoring, logging, tracing) using technologies like Prometheus, Grafana, the ELK stack and OpenTelemetry, improving system stability, reducing operational overhead, and enhancing service performance
  • Identified gaps in existing processes and implemented changes to improve the overall resilience and high availability of systems such as Kubernetes (EKS), virtual machines (EC2) and databases (RDS) on AWS
  • Defined and monitored SLOs to measure system reliability to achieve defined performance metrics
  • Enhanced system reliability by optimizing SLAs to reduce response time and improve performance
  • Collaborated with the team to implement Agile methodologies, resulting in enhanced efficiency and increased productivity
  • Streamlined upgrades (Kubernetes, DB, Redis) on AWS with minimal downtime, using Terraform, GitLab, and Helm
  • Implemented Python scripts to interact with tools like Kuberhealthy to perform synthetic monitoring
Cloud-Native Applications, AWS Lambda, GitOps, Python, Kubernetes, Amazon Web Services (AWS), +14

ShareChat

DevOps Engineer II | GCP, Kubernetes, Jenkins, Terraform, Docker and Python

Jun 2021 – Dec 2022 · 1 yr 6 mos

  • Oversaw and led a DevOps team, providing mentorship, guidance, and support to improve overall efficiency
  • Managed 25,000+ Kubernetes nodes running 300+ microservices, optimizing workloads and boosting operational efficiency.
  • Formulated cost-cutting strategies, reducing monthly expenses by 30%, yielding USD 3M in annual savings
  • Designed and developed a migration utility in Python to migrate 9 petabytes of data from AWS S3 to GCP GCS
  • Conducted comprehensive analysis and troubleshooting of infrastructure components including compute, networking (DNS, TCP/IP, load balancing), storage and operating systems on GCP
  • Initiated and maintained DevOps automation, enhancing developer experience through IaC, configuration management, and release management.
Cloud-Native Applications, DevOps, Python, Kubernetes, Elastic Stack (ELK), Production Systems, +16

Goldman Sachs

Associate | AWS, Linux, Jenkins, Terraform, Docker and Python

Jan 2020 – Jun 2021 · 1 yr 5 mos · Remote

  • Guided the deployment, troubleshooting and maintenance of data science applications like Apache Airflow, Snowflake, AWS EMR, etc., to minimize operational effort
  • Introduced CI/CD pipelines, ensuring efficient and reliable application deployment with reduced failures
  • Enforced Terraform Cloud adoption, automating AWS resource provisioning for enhanced operational efficiency
  • Integrated APIs and developed custom tools in Python to facilitate seamless communication between different systems
  • Mentored DevOps engineers, fostering a culture of ownership and continuous learning
DevOps, AWS Lambda, Python, Amazon Web Services (AWS), Elastic Stack (ELK), AWS CloudFormation, +11

EPAM Systems

Senior DevOps Engineer | AWS, Kubernetes, Jenkins, Terraform, Docker, Linux and Python

Jul 2019 – Jan 2020 · 6 mos

  • Delivered Dockerized Java, Golang and JavaScript applications
  • Led cross-functional collaboration to identify and address bottlenecks in the build and release pipeline, resulting in a 30% increase in deployment efficiency and accelerated time-to-market for new features
  • Streamlined coordination for troubleshooting and resolving scalability and deployment challenges, resulting in a 50% decrease in production incidents and enhanced system stability
Python, Amazon Web Services (AWS), Elastic Stack (ELK), Production Systems, Terraform, Problem Solving, +8

Flux7 Inc.

DevOps Engineer | AWS, Kubernetes, Jenkins, Terraform, Docker, Linux and Python

Dec 2017 – Jun 2019 · 1 yr 6 mos · Remote

  • Gathered requirements and performed functional and detailed design analysis
  • Engineered and implemented a highly scalable and fault-tolerant distributed infrastructure on AWS using services like IAM, VPC, EC2, ECS, EKS, RDS, ELB, S3, etc.
  • Leveraged the Python boto3 library to integrate multiple AWS services like EC2, Lambda, API Gateway, DynamoDB, SNS, etc.
  • Simplified AWS deployment with optimized CloudFormation templates, reducing manual effort
  • Conducted demo sessions, presentations, and knowledge transfer (KT) to improve user proficiency
DevOps, AWS Lambda, Kubernetes, Amazon Web Services (AWS), Elastic Stack (ELK), AWS CloudFormation, +10

Digite, Inc.

2 roles

DevOps Engineer | AWS, Kubernetes, Jenkins, Terraform, Docker, Linux and Python

Promoted

Jan 2016 – Dec 2017 · 1 yr 11 mos

  • Designed and developed a continuous deployment pipeline, integrating Test-Kitchen, Vagrant, Git, Jenkins and Chef across geographically separated hosting zones in AWS
  • Automated developer machine setup using Vagrant / VirtualBox, Git and Chef
  • Migrated code from SVN to Git along with a new branching and merging strategy to improve product quality and speed up release cycles
  • Created new in-house utilities (style checks, code-quality checks, connection-leak detection and bug finding) to enhance code quality and performance
  • Led the effort to bring all repositories in-house and implement a scalable data-center-to-cloud-and-back solution
  • Partnered with teams to determine application requirement specifications
DevOps, AWS Lambda, Python, Amazon Web Services (AWS), Elastic Stack (ELK), Production Systems, +7

Software Engineer

Apr 2015 – Dec 2015 · 8 mos

Production Systems, Maven, Problem Solving

Quality kiosk

Solution Engineer

Aug 2014 – Mar 2015 · 7 mos · On-site

  • Implemented application performance monitoring solutions such as Nagios, Dynatrace and Splunk for 300+ applications in the banking domain
  • Collaborated with different teams to gather requirements for monitoring their applications
  • Participated in the on-call rotation to remediate system failures
Production Systems

Education

University of Mumbai

Bachelor of Engineering - BE, Computer Science

Jan 2009 – Jan 2014
