Raman Grover

CEO

Palo Alto, California, United States · 19 yrs 10 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Developed Google's flagship backup system for Ads data.
  • Core contributor to Apache AsterixDB, impacting real-time data management.
  • Designed scalable log analytics framework processing PBs of data daily.
Stackforce AI infers this person is a Big Data and Cloud Infrastructure expert with extensive experience in scalable systems.

Skills

Core Skills

Data Infrastructure · Machine Learning · Database Systems · Cyber Security · Log Analytics · Distributed Systems · Big Data · Real-time Aggregation · Distributed Algorithms · Data Sampling · Data Synchronization · Real-time Billing

Other Skills

Anomaly Detection · Azure Cloud · Backup and Recovery Systems · Batch Computations · Conflict Resolution · Data Processing · Dataflow Engine · Hadoop · Map-Reduce · Project Management · Query Optimization · Real-time Data Ingestion · Telecom Systems · TensorFlow

Experience

Oracle

Architect, Conversational AI, ML, Data Infrastructure

Jun 2022 – Present · 3 yrs 9 mos · North Carolina, United States

Google

Staff Software Engineer (Data Infrastructure/ML)

Mar 2018 – Jun 2022 · 4 yrs 3 mos · Mountain View, California

  • Built Kairos – Google’s flagship backup & recovery system for Ads data at PB scale
  • Received the Google Perf Award for creating Kairos, a first-of-its-kind backup and recovery system for Napa, Google’s Ads analytical data warehouse. Introduced the concept, designed the architecture, and built the system from scratch to protect one of Google’s largest and most business-critical datasets.
  • Delivered fine-grained, table-level incremental backups and atomic, in-place restores of datasets spanning tens of petabytes, all while maintaining continuous availability and achieving zero Recovery Point Objective (RPO).
  • Kairos remains Google’s most flexible and scalable infrastructure for data protection—setting the standard for recoverability and resilience in large-scale, real-time systems.
  • Contributed to TensorFlow, integrating support for custom datasets to enable scalable ML pipelines over massive internal datasets, accelerating AI-driven product innovation across the company.
Data Infrastructure · Machine Learning · Backup and Recovery Systems · TensorFlow

Awake Networks

Principal Member Of Technical Staff

Jul 2016 – Mar 2018 · 1 yr 8 mos · Mountain View, California

  • Applied database systems research in the network stack for enhanced cyber security.
  • Designed and implemented a query optimizer with sophisticated parsing and rewriting for efficient plan generation, for execution over massive quantities of fine-grained event data generated at the network layer.
  • Designed and implemented scalable offline batch computations for aggregation and summarization of network activity, scaling to thousands of devices that generate large quantities of event data.
Database Systems · Cyber Security · Query Optimization · Batch Computations

Microsoft Research

2 roles

Senior Research SDE

Mar 2015 – Jul 2016 · 1 yr 4 mos · Mountain View, CA

  • Designed and built a scalable, near-real-time log analytics framework that collects and processes logs produced in a distributed environment of hundreds of thousands of nodes. The volume of data collected, processed, and indexed is on the order of 5-7 PB/day.
  • Designed and built a dataflow engine that executes a generic DAG (directed acyclic graph) composed of vertices and edges. The dataflow engine is capable of processing continuously arriving data, subjecting it to a set of operations, possibly in parallel. It exhibits elasticity, scalability, and tolerance to soft and hard failures.
Log Analytics · Dataflow Engine · Distributed Systems

Research Intern (Search Labs)

Jun 2014 – Sep 2014 · 3 mos · Mountain View, California

  • Designed and built a framework for real-time aggregation and anomaly detection in multi-dimensional, high-velocity data (e.g. Twitter).
  • The framework, deployed in the Azure cloud as a service, scales to ingest the Twitter firehose. It processes high-velocity data in real time and is optimized for a low memory footprint and low latency when analyzing events and performing cube or pivot operations. The work was implemented in C# (.NET platform).
Real-time Aggregation · Anomaly Detection · Azure Cloud

@walmartlabs

Visiting Researcher

Jul 2012 – Dec 2012 · 5 mos · San Francisco Bay Area

  • Designed and developed distributed algorithms for elastic dataflow in Mupd8, a framework for ingesting and processing streams of fast-flowing data. Elasticity allows the runtime to scale in or out as data arrival rates fluctuate. The work was implemented in Scala.
Distributed Algorithms · Data Processing

The Apache Software Foundation

PMC Member/Committer at Apache AsterixDB

Sep 2010 – Jan 2016 · 5 yrs 4 mos · San Francisco Bay Area

  • Core contributor and PMC member for Apache AsterixDB, an open-source Big Data platform for semi-structured data, born out of cutting-edge research.
  • Primary contributor for real-time data ingestion support, a core capability that originated from my doctoral research and was later integrated into AsterixDB’s architecture.
  • This work received the Ten-Year Test of Time Award at EDBT 2025, recognizing its enduring impact on real-time big data management.
  • Actively shape the system’s roadmap, review major contributions, and mentor community developers as part of the project leadership.
Big Data · Real-time Data Ingestion · Project Management

Facebook

Engineering Intern (Data Infrastructure Team)

Jun 2010 – Sep 2010 · 3 mos · Palo Alto, California

  • Addressed the problem of using Map-Reduce to sample a massive data set in order to produce a fixed-size sample whose contents satisfy a given predicate. The work delivered a 100x performance gain on typical sampling queries from the Facebook workload. A research paper describing the work was accepted at ICDE 2012. The work was implemented in Java.
Map-Reduce · Data Sampling

Adobe

Computer Scientist

Aug 2005 – Aug 2008 · 3 yrs · Noida, Uttar Pradesh, India

  • Developed a framework for regular synchronization of user data from enterprise LDAP systems with support for conflict resolution.
  • Developed a thread-safe caching framework in a distributed system for the Adobe LiveCycle platform. The work was implemented in Java.
Data Synchronization · Conflict Resolution · Distributed Systems

Baypackets

Software Engineer

Jun 2004 – Jun 2005 · 1 yr · Noida, Uttar Pradesh, India

  • Developed a rating and charging engine to facilitate real-time billing in the telecom domain. The work was implemented in C++, Java, and PL/SQL (a database procedural language).
Real-time Billing · Telecom Systems

Education

UC Irvine

Doctor of Philosophy (Ph.D.) — Computer Science (Distributed Systems)

Jan 2010 – Jan 2014

UC Irvine

M.S — Computer Science

Jan 2009 – Jan 2010

Indian Institute of Technology, Roorkee

B.Tech — Computer Science & Engineering

Jan 2000 – Jan 2004