Abhigyan Agarwal

Software Engineer

India · 5 yrs 2 mos experience
Most Likely To Switch: Highly Stable

Key Highlights

  • Expert in architecting robust solutions at scale
  • Recognized for resolving critical production issues
  • Passionate about AI/ML Infra and foundational systems
Stackforce AI infers this person is a Backend-heavy Infrastructure Engineer with expertise in large-scale systems.

Skills

Core Skills

Distributed Systems, Software Development

Other Skills

Adobe Illustrator, Adobe Photoshop, Algorithms, C (Programming Language), C++, Code Design, Code Review, Data Structures and Algorithms, Java, Keyboardist, Microservices, MySQL, Network Infrastructure, Network telemetry, Problem Solving

About

I am a problem solver at heart, currently working as a Software Engineer in Google's Technical Infrastructure. With a foundation in competitive programming and a passion for foundational, systemic challenges, I enjoy architecting robust solutions for complex problems at massive scale.

At Google, I've spent three years tackling challenges within the large-scale network telemetry space. This experience has given me a deep, hands-on understanding of what it takes to build and maintain reliable, high-performance systems that operate at planet scale.

What truly drives me is the satisfying struggle of a difficult problem. My approach is rooted in first principles thinking, thoroughly understanding the fundamental truths of a system. This has enabled me to resolve some of our team's most challenging system-wide production issues, an effort that has been recognized with multiple Spot and Peer awards.

While I enjoy the reactive thrill of debugging, my passion lies in applying that knowledge proactively to the design and foundational stages. I enjoy anticipating corner cases and architecting resilient systems from the ground up to prevent problems before they happen.

Currently, my interests are centered on AI/ML Infra, R&D, and foundational systems engineering. I'm actively deepening my knowledge in these areas through self-study and personal research, and am excited to build the next generation of innovative, high-impact products. I'm always open to connecting with people who are passionate about building the future of technology.

Experience

Google

Software Engineer

Jul 2022 - Present · 3 yrs 8 mos · Bengaluru, Karnataka, India · On-site

  • Worked on core feature development, reliability, and scaling for Google's large-scale network telemetry systems (NetSLO, gTIB) that maintain Google's production network health across XX million machines, processing XX billion probes per second for real-time alerting, SLO reporting and automated issue diagnosis.
  • Key Contributions:
      ● Resolved Systemic Data Consistency Issues & Optimized Performance
          › Conducted a full-stack root cause analysis, diagnosing multiple interacting failure modes (client-side cache invalidation, server restarts, DB eventual consistency).
          › Designed and implemented a multi-layered solution addressing the root causes: a 'single source of truth' (SOT) abstraction layer, fixes to client-side caching, and server-side cache pre-warming.
          › Impact: Eliminated a class of bugs (200+ prod issues), yielding major gains in stability & performance:
              ✓ Reduced p99 latency by >75% (>2 min to <30 s).
              ✓ Decreased p90 latency by >83%.
      ● Production Reliability & Incident Response
          › Enhanced service reliability by leading large-scale incident response, authoring post-mortems & resolving critical data accuracy bugs.
          › Improved stability by building automated monitoring to detect & alert on pipeline bottlenecks before service impact.
      ● Optimized High-Throughput Data Pipeline
          › Engineered a solution to efficiently build large protobuf messages in a high-throughput pipeline, ensuring strict size compliance without performance degradation.
      ● Accelerated Development Velocity
          › Designed & implemented an integration test framework, enabling robust E2E validation and empowering the team to ship features faster & with higher confidence.
      ● Expanded Network Monitoring Coverage
          › Drove the expansion to critical Edge networks by proposing a major redesign of the resource scheduling.
          › Extended monitoring to the Control Plane network, increasing overall network monitoring coverage.
C++, Distributed Systems, Network telemetry, Problem Solving, Software Design, Microservices (+2 more)

Amazon

Software Development Engineer Intern

May 2021 - Jul 2021 · 2 mos · Hyderabad, Telangana, India

Cybros - LNMIIT

Software Engineering Coordinator

Sep 2020 - Mar 2022 · 1 yr 6 mos

Education

The LNM Institute of Information Technology

Bachelor of Technology - BTech — Computer and Communication Engineering

Jan 2018 - Jan 2022

Shiv Jyoti Sr Sec School

CBSE — PCM

Jan 2015 - Jan 2017

Holy Child School

ICSE

Jan 2005 - Jan 2015
