Naveen Kumar

DevOps Engineer

Bengaluru, Karnataka, India · 2 yrs 9 mos experience

Key Highlights

Expert in proactive cloud monitoring and incident response.
Strong collaboration with DevOps and SRE teams.
Focused on reliability engineering and operational excellence.

Stackforce AI infers this person is a Cloud Infrastructure Engineer specializing in SaaS environments with a focus on reliability and monitoring.

Skills

Core Skills

Cloud Monitoring, Incident Management, Amazon Web Services (AWS)

Other Skills

Production Support, Root Cause Analysis, Infrastructure Monitoring, AWS, Azure, GCP, Technical Troubleshooting, Kibana, Application Monitoring, Amazon CloudWatch, Load Balancing, Grafana, Virtual Private Cloud, Auto Scaling, Alert Triage

About

I am a Cloud Monitoring Engineer with experience supporting production and customer-facing cloud environments in fast-paced, SLA-driven settings. I specialize in proactive monitoring, alert triage, incident response, and root cause analysis to ensure high availability and performance across cloud-based systems.

In my current role at Flexera, I monitor hybrid cloud infrastructure and production services, analyzing metrics, logs, and system behavior to detect issues before they impact customers. I collaborate closely with DevOps, SRE, and Product teams to investigate incidents, validate fixes, and continuously improve monitoring accuracy and operational reliability.

With a strong foundation in production support and troubleshooting, I bring a structured approach to problem-solving: correlating logs, metrics, and application behavior to identify underlying causes. I have hands-on exposure to AWS environments, monitoring tools, and core infrastructure components such as compute, networking, and scaling systems.

I am actively deepening my expertise in cloud architecture, observability, automation, and reliability engineering, with a long-term goal of growing into advanced cloud infrastructure and solution-focused roles.

I take pride in ownership, clear documentation, and staying calm under pressure. Reliability is not reactive; it is built intentionally. I thrive in collaborative, high-velocity environments where continuous learning and operational excellence drive impact.

📩 Feel free to connect with me here on LinkedIn.

Experience

Total Experience: 2 yrs 9 mos

Current Experience

Flexera

Cloud Monitoring Engineer

Feb 2026 – Present · 4 mos · Bengaluru

At Flexera, I am part of the Cloud Monitoring (NOC) team, responsible for ensuring the availability, reliability, and performance of customer environments across AWS, Azure, GCP, and Kubernetes-based platforms.
My role focuses on proactive monitoring, incident response, and operational support for production systems. I work extensively with logs, metrics, and alerts to identify issues, perform root cause analysis, and ensure minimal impact to customer environments.
I primarily support production alert management, cloud infrastructure monitoring, and environments built around IT asset management (ITAM), ensuring systems remain stable, optimized, and compliant.
I collaborate closely with SRE, DevOps, and Product teams to improve monitoring effectiveness, reduce alert noise, and enhance system reliability across distributed cloud systems.
Key Responsibilities:
Monitoring production environments across AWS, Azure, GCP, and Kubernetes clusters.
Managing and investigating production alerts, ensuring timely response and resolution within SLA.
Performing root cause analysis using logs, metrics, and monitoring dashboards.
Supporting ITAM-driven environments, ensuring visibility, compliance, and efficient resource utilization.
Handling incident workflows through ticketing systems and coordinating with cross-functional teams.
Troubleshooting infrastructure and application issues across multi-cloud and containerized environments.
Analyzing system behavior to identify patterns and prevent recurring incidents.
Contributing to improvements in monitoring processes, alerting strategies, and documentation.
Impact & Learning:
This role is helping me build strong expertise in:
Multi-cloud monitoring across AWS, Azure, and GCP.
Kubernetes and containerized workload observability.
Incident management and production support.
Reliability engineering principles in distributed systems.
Log-driven debugging and performance analysis.