Varun Singh

DevOps Engineer

Hyderabad, Telangana, India · 6 yrs 7 mos experience
Highly Stable · AI ML Practitioner

Key Highlights

  • Expert in AI Infrastructure Networking and Cloud Automation.
  • Proven track record in designing scalable data center networks.
  • Strong background in troubleshooting and optimizing network performance.
Stackforce AI infers this person is a Cloud Infrastructure Engineer specializing in AI-driven networking solutions.

Skills

Core Skills

Infrastructure as Code (IaC) · Network Automation · AI Infrastructure Networking · Data Center Networking · Network Troubleshooting

Other Skills

Network Automation (Python | pyATS | Ansible) · AI & Cloud Infrastructure Networking (NVIDIA DGX Cloud | Cisco ACI) · AI & Cloud Infrastructure Networking · Python · Ansible · Cisco Nexus · Routing Protocols · SONiC · VXLAN · BGP EVPN · Traffic Engineering · Infrastructure as Code (IaC) & Observability · Cisco Application Centric Infrastructure (ACI) · Virtual Extensible LAN (VXLAN) · Optimizing Performance

About

I’m a Network Infrastructure Engineer with over 5 years of experience designing, implementing, and optimizing enterprise-grade data center networks, now contributing to NVIDIA’s DGX Cloud, where high-performance computing meets next-generation network engineering.

Prior to NVIDIA, I worked as a Network Consultant at Cisco Systems, architecting and deploying large-scale Cisco Nexus and Application Centric Infrastructure (ACI) solutions across global enterprise environments. From greenfield builds to complex multi-pod and multi-site fabrics, I’ve delivered highly available, secure, and scalable data center networks supporting critical workloads and digital transformation initiatives.

At NVIDIA, I’m expanding my expertise into AI and GPU-driven cloud networking, focusing on high-throughput, low-latency interconnects, automation, and observability within DGX Cloud environments. My current work blends network design, automation, and performance engineering to support demanding AI and HPC workloads at global scale.

I’m passionate about bridging traditional data center engineering with modern cloud-native and automation-driven paradigms, using Python and Ansible to bring Infrastructure-as-Code principles to production networking.

Core Skills:

  • NVIDIA DGX Cloud | AI Infrastructure Networking
  • Cisco ACI (Multi-Pod, Multi-Site)
  • Cisco Nexus 2K–9K | NX-OS
  • VXLAN, BGP, OSPF, L2/L3 Design
  • Network Automation (Python, Ansible)
  • Infrastructure as Code (IaC)
  • Data Center Architecture & Scalability
  • Network Troubleshooting & Optimization
  • Client Consulting | Solution Delivery | Documentation

Experience

Total Experience: 6 yrs 7 mos
Average Tenure: 2 yrs 2 mos
Current Experience: 1 mo

Microsoft

Cloud Network Engineer II

May 2026 - Present · 1 mo · Hyderabad · On-site

Infrastructure as Code (IaC) · Network Automation (Python | pyATS | Ansible) · Network Automation

Nvidia

Senior Network Infrastructure Engineer

Oct 2025 - May 2026 · 7 mos · Bengaluru · On-site

  • Building and maintaining high-performance DGX Cloud network fabrics connecting AI/HPC clusters across global data centers.
  • Designing and optimising low-latency, high-bandwidth interconnects supporting distributed GPU workloads.
  • Implementing automation frameworks (Python, pyATS, Ansible) to validate, monitor, and provision network infrastructure.
  • Collaborating with software, compute, and storage teams to improve infrastructure reliability, telemetry, and observability.
  • Working on Infrastructure-as-Code initiatives to standardise configurations and accelerate deployment across DGX Cloud environments.
  • Enhancing network visibility through telemetry, sFlow, and real-time monitoring pipelines.
AI & Cloud Infrastructure Networking (NVIDIA DGX Cloud | Cisco ACI) · Network Automation (Python | pyATS | Ansible) · AI Infrastructure Networking · Network Automation

Cisco

Senior Network Consultant

Dec 2020 - Oct 2025 · 4 yrs 10 mos · Bengaluru, Karnataka, India · On-site

  • Specialized in troubleshooting complex data center network issues involving Cisco ACI and the Nexus portfolio (2K–9K), ensuring minimal downtime and high availability.
  • Performed in-depth packet-level analysis using tools like Wireshark and tcpdump to identify root causes of L2/L3 connectivity issues, latency spikes, packet loss, and asymmetric routing.
  • Captured and interpreted traffic flows to analyze behaviours such as ARP failures, TCP retransmissions, and dropped packets across fabric paths.
  • Diagnosed critical issues involving protocols such as VXLAN, BGP, OSPF, STP, and multicast within multi-tenant ACI environments.
  • Leveraged Python scripting and Ansible to automate config validation, troubleshooting routines, and post-incident remediation workflows.
  • Collaborated with client-side engineers and internal escalation teams during major incident calls to resolve high-impact outages and performance degradations.
  • Documented findings, packet traces, and mitigation steps for each incident to support knowledge sharing and RCA reports.
  • Delivered proactive support by monitoring fabric health, tracking fault messages, and advising clients on network hardening and operational best practices.
  • Maintained deep knowledge of Cisco NX-OS, ACI constructs (EPGs, contracts, BDs, VRFs), and data center design principles to guide accurate diagnosis and resolution.
Cisco Nexus · Routing Protocols · Data Center Networking · Network Troubleshooting

Wipro Limited

Project Engineer

Oct 2019 - Nov 2020 · 1 yr 1 mo · Chennai, Tamil Nadu, India

  • Monitored and maintained the network infrastructure of a large telecom client based in Indonesia.
  • Performed troubleshooting for production and non-production issues including routing errors, packet drops, and switch failures.
  • Participated in a cross-functional rotation to gain hands-on experience with routing protocols including OSPF, BGP, and MPLS.
  • Served as a point of contact for configuration changes, firmware upgrades, and root cause analysis.
  • Raised and managed tickets via JIRA and coordinated with Cisco TAC for high-severity incidents.

Education

Ajay Kumar Garg Engineering College, Ghaziabad

Bachelor of Technology — Computer Science

Jan 2015 - Jan 2019
