Kunal Dhole

DevOps Engineer

India · 9 yrs 7 mos experience

Key Highlights

  • Led migration to modern cloud infrastructure, improving reliability by 30%.
  • Implemented observability strategy, enhancing incident response by 50%.
  • Developed CI/CD pipelines, reducing deployment times significantly.
Stackforce AI infers this person is a Site Reliability Engineer specializing in cloud infrastructure and DevOps practices.

Skills

Core Skills

Incident Management · Monitoring & Observability · Continuous Integration and Continuous Deployment (CI/CD) · Migration Planning · Security Best Practices

Other Skills

DevOps · AWS · CI/CD · Cloud Computing · YAML · Technical Requirements · Problem Solving · Analytical Skills · Opsgenie · Prometheus.io · Grafana · Ingress · Istio · Helm Charts · Argo CD

About

As a seasoned Site Reliability Engineer (SRE) with over 9 years of experience, I specialize in designing, implementing, and maintaining robust, scalable, and secure systems. My passion lies in ensuring the reliability and performance of complex software systems, and I thrive on solving intricate challenges that drive business continuity and user satisfaction.

Key Strengths:

  • System Design & Architecture: Expertise in building and scaling high-availability systems using microservices, containers, and cloud-native technologies.
  • Automation & Scripting: Proficient in automating repetitive tasks and processes using tools like Ansible, Terraform, and custom scripts in Python, Bash, or other languages.
  • Monitoring & Observability: Skilled in implementing comprehensive monitoring and alerting solutions with Prometheus, Grafana, ELK Stack, and other industry-standard tools to ensure system health and performance.
  • Incident Management: Adept at leading incident response efforts, conducting post-mortem analyses, and implementing measures to prevent recurrence.
  • Performance Optimization: Focused on continuous improvement and optimization of system performance, load balancing, and capacity planning to meet business demands.

Professional Highlights:

  • Successfully led the migration of legacy systems to modern cloud infrastructure, resulting in a 30% improvement in system reliability and a 10% reduction in operational costs.
  • Spearheaded the implementation of a company-wide observability strategy, improving incident response times by 50%.
  • Collaborated with cross-functional teams to develop and deploy CI/CD pipelines, reducing deployment times and improving code quality.

I am deeply committed to fostering a culture of reliability, resilience, and continuous improvement. My approach blends technical expertise with a strategic mindset, always aiming to align technological solutions with business goals.
In addition to my technical skills, I am an advocate for knowledge sharing and continuous learning. I regularly contribute to internal documentation, mentor junior engineers, and participate in industry conferences and meetups. Let's connect if you're passionate about building reliable systems or if you’re looking for ways to optimize and scale your infrastructure. Together, we can ensure that technology not only meets but exceeds expectations, driving success and innovation.

Experience

Mashreq

Technology Manager - DevOps

Aug 2024 - Present · 1 yr 7 mos · Pune District, Maharashtra, India · Remote

John Deere

Lead SRE

Apr 2022 - Jul 2024 · 2 yrs 3 mos · Pune District, Maharashtra, India · On-site

Roles and Responsibilities:

Site Reliability Engineer with 8 years of experience in:

  • ♦ System Availability and Reliability: Create SLOs and error budgets, monitor system performance, and perform incident response.
  • ♦ PRR and Fire Drills: Conduct PRRs and fire drills to improve resilience and preparedness.
  • ♦ Incident Management: Conduct PRR (Product Readiness Review), participate in on-call rotation, and conduct post-incident reviews.
  • ♦ Monitoring and Observability: Create meaningful monitoring systems to maintain proper logging and tracing, which reduced downtime by 25%.
  • ♦ Collaboration and Communication: Improve cross-functional collaboration and follow knowledge-sharing best practices.
  • ♦ Infrastructure Management: Manage infrastructure and optimize cloud resources.
  • ♦ Security and Compliance: Implement security best practices, decreasing vulnerability exposure by 40%.
  • ♦ Performance Optimization: Collaborate with QA engineers to ensure 99.98% uptime for high-traffic applications.
  • ♦ Capacity Planning and Scaling: Plan for capacity management and create scalable solutions.
  • ♦ Cost Optimization: Analyze and optimize costs related to infrastructure and services.
DevOps · AWS · Incident Management · Monitoring & Observability

Harman International

Senior Product Engineer

Jan 2020 - Apr 2022 · 2 yrs 3 mos · Bangalore · Hybrid

  • ♦ Continuous Integration and Continuous Deployment (CI/CD): Develop, implement, and manage CI/CD pipelines to automate the build, test, and deployment processes.
  • ♦ Infrastructure as Code (IaC): Use tools like Terraform, Ansible, or CloudFormation to manage and provision infrastructure through code.
  • ♦ Monitoring and Logging: Use monitoring tools like Grafana and Prometheus, and perform log analysis with the ELK stack.
  • ♦ Containerization and Orchestration: Use containerization technologies like Docker and Kubernetes to deploy, scale, and manage containerized applications.
  • ♦ Coordinated the rollout of system patches for 5,000+ users with zero downtime.
  • ♦ Developed scripts for automated security monitoring, reducing issue detection time by 50%.
  • ♦ Streamlined change management processes, improving release schedules by 20%.
  • ♦ Implemented monitoring solutions with Prometheus and Grafana for proactive incident management.
  • ♦ Facilitated compliance activities, decreasing vulnerability risk.
DevOps · AWS · Continuous Integration and Continuous Deployment (CI/CD) · Monitoring & Observability

Wipro

Cloud Engineer

Jul 2016 - Jan 2020 · 3 yrs 6 mos · Bengaluru, Karnataka, India · On-site

  • ♦ Migration Planning: Plan and execute the migration of existing on-premises applications and services to the cloud.
  • ♦ Disaster Recovery and Backup: Implement DR and backup strategy to protect against data loss.
  • ♦ Security Best Practices: Implement security measures to protect cloud infrastructure and data, including encryption, IAM policies, and network security.
  • ♦ Monitored performance and resource utilization.
  • ♦ Managed cloud infrastructure on AWS.
  • ♦ Provided on-call support for production applications.
  • ♦ Handled incident tracking and workflow management with JIRA and ServiceNow.
  • ♦ Administered Linux and application support.
  • ♦ Customized Docker images and maintained CI/CD pipelines with Jenkins.
  • ♦ Validated system readiness through performance test environments.
AWS · Cloud Computing · Migration Planning · Security Best Practices

Education

GOVERNMENT COLLEGE OF ENGINEERING

B.Tech — Electronics and Telecommunication Engineering

Jan 2012 - Jan 2016
